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■ Abstract 

' We present a randomized algorithm that, on input a symmetric, weakly diagonally dom- 

c/3 , inant n-by-n matrix A with m non-zero entries and an n- vector 6, produces an x such that 

i || £ — vltfo|| < e || 6 1| in expected time 

^t" ; mlog° (1) nlog(l/e). 

in 
o 



The algorithm applies subgraph preconditioncrs in a recursive fashion. These preconditioncrs 
improve upon the subgraph preconditioners first introduced by Vaidya (f990). For any 
[ — ■ symmetric, weakly diagonally-dominant matrix A with non-positive off-diagonal entries and 

f^) \ k > 1 , we construct in time Trilog *- 1 -* n a preconditioner of A with at most 

vo 

O ; 2(n - 1) + (m/k) log° (1) n 

O ■ non-zero off-diagonal entries such that the finite generalized condition number k f (A, B) is 

at most k. If the non-zero structure of the matrix is planar, then the condition number is at 
most 

rN I O {{n/k) log n log log 2 n) , 



and the corresponding linear system solver runs in expected time 

0(nlog 2 n + nlogn (log log n) 2 log(l/e)). 

Similar bounds are obtained on the running time of algorithms computing approximate 
Fiedler vectors. 



"This paper is the last in a sequence of three papers expanding on material that appeared first under the title 
"Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems" |ST04j . 
The second paper, "Spectral Sparsification of Graphs" [ST08c| contains algorithms for constructing sparsifiers 
of graphs, which we use in this paper to build preconditioners. The first paper, "A Local Clustering Algorithm 
for Massive Graphs and its Application to Nearly-Linear Time Graph Partitioning" |ST08bj contains graph 
partitioning algorithms that are used to construct sparsifiers in the second paper. 

This material is based upon work supported by the National Science Foundation under Grant Nos. 0325630, 
0324914, 0634957, 0635102 and 0707522. Any opinions, findings, and conclusions or recommendations expressed in 
this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. 
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1 Introduction 

We design an algorithm with nearly optimal asymptotic complexity for solving linear systems 
in symmetric, weakly diagonally dominant (SDDo) matrices. The algorithm applies a classical 
iterative solver, such as the Preconditioned Conjugate Gradient or the Preconditioned Chebyshev 
Method, with a novel preconditioner that we construct and analyze using techniques from graph 
theory. Linear systems in these preconditioners may be reduced to systems of smaller size in 
linear time by use of a direct method. The smaller linear systems are solved recursively. The 
resulting algorithm solves linear systems in SDDo matrices in time almost linear in their number 
of non-zero entries. Our analysis does not make any assumptions about the non-zero structure 
of the matrix, and thus may be applied to the solution of the systems in SDDo matrices that 
arise in any application, such as the solution of elliptic partial differential equations by the 
finite element method [Str86t lBHV04j . the solution of maximum flow problems by interior point 
algorithms [FG041 IDS08] . or the solution of learning problems on graphs [BMN04| |ZBL + Q3 
IZGL03] . 

Graph theory drives the construction of our preconditioners. Our algorithm is best un- 
derstood by first examining its behavior on Laplacian matrices — the symmetric matrices with 
non-positive off-diagonals and zero row sums. Each n-by-ra Laplacian matrix A may be associ- 
ated with a weighted graph, in which the weight of the edge between distinct vertices i and j is 
—Aij (see Figure [T|). We precondition the Laplacian matrix A of a graph G by the Laplacian 
matrix B of a subgraph H of G that resembles a spanning tree of G plus a few edges. The sub- 
graph H is called an ultra- spar sifier of G, and its corresponding Laplacian matrix is a very good 
preconditioner for A: The finite generalized condition number Kf(A, B) is log°^ n. Moreover, 
it is easy to solve linear equations in B. As the graph H resembles a tree plus a few edges, we 
may use partial Cholesky factorization to eliminate most of the rows and columns of B while 
incurring only a linear amount fill. We then solve the reduced system recursively. 




Figure 1: A Laplacian matrix and its corresponding weighted graph. 

The technical meat of this paper lies in the construction of ultra-sparsifiers for Laplacian 
matrices, which appears in Sections [7] through [TUl In the remainder of the introduction, we 
formally define ultra-sparsifiers, and the sparsifiers from which they are built. In Section [21 
we survey the contributions upon which we build, and mention other related work. We devote 
Section [3] to recalling the basics of support theory, defining the finite condition number, and 
explaining why we may restrict out attention to Laplacian matrices. 

In Section HJ we state the properties we require of partial Cholesky factorizations, and we 
present our first algorithms for solving equations in SDDo-matrices. These algorithms directly 
solve equations in the preconditioners, rather than using a recursive approach, and take time 
roughly 0(m 5 / 4 \og 0< ^ n) for general SDDo-matrices and 0(n 9//8 log 1//2 n) for SDDMo-matrices 
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with planar non-zero structure. To accelerate these algorithms, we apply our preconditioned 
in a recursive fashion. We analyze the complexity of these recursive algorithms in Section 
obtaining our main algorithmic results. In Section we observe that these linear system solvers 
yield efficient algorithms for computing approximate Fiedler vectors, when applied inside the 
inverse power method. 

We do not attempt to optimize the exponent of log n in the complexity of our algorithm. 
Rather, we present the simplest analysis we can find of an algorithm of complexity m log°^ n log(l/ e) . 
We expect that the exponent of log n can be substantially reduced through advances in the con- 
structions of low-stretch spanning trees, sparsifiers, and ultrasparsifiers. Experimental work is 
required to determine whether a variation of our algorithm will be useful in practice. 

1.1 Ultra-sparsifiers 

To describe the quality of our preconditioners, we employ the notation A ^ B to indicate that 
B — A is positive semi-definite. We define a SDDMo-matrix to be a SDDo-matrix with no 
positive off-diagonal entries. When positive definite, the SDDMo-matrices are M-matrices and 
in particular are Stieltjes matrices. 

Definition 1.1 (Ultra-Sparsifiers). A (k,h) -ultra- spar sifier of ann-by-n SDDMo-matrix A with 
2m non-zero off-diagonal entries is a SDDMo-matrix A s such that 

(a) A s 4 A ^ k-A 8 . 

(b) A s has at most 2(n — 1) + 2hm/k non-zero off-diagonal entries. 

(c) The set of non-zero entries of A s is a subset of the set of non-zero entries of A. 

In Section fT0| we present an expected m log ^ n-time algorithm that on input a Laplacian 
matrix A and a k > 1 produces a (k, /i)-ultra-sparsifier of A with probability at least 1 — l/2n, 
for 

h = c 3 log^ 4 n, (1) 

where C3 and C4 are some absolute constants. As we will use these ultra-sparsifiers throughout 
the paper, we will define a k- ultra- spar sifier to be a (k, /i)-ultra-sparsifier where h satisfies (pQ). 

For matrices whose graphs are planar, we present a simpler construction of (k, /^-ultra- 
sparsifiers, with h = O (logn(loglogn) 2 ). This simple constructions exploits low-stretch span- 
ning trees |AKPW95j IEEST08j IABN08] . and is presented in Section M Our construction of 
ultra-sparsifiers in Section [10] builds upon the simpler construction, but requires the use of 
sparsifiers. The following definition of sparsifiers will suffice for the purposes of this paper. 

Definition 1.2 (Sparsifiers). A d-sparsifier ofn-by-n SDDMo -matrix A is a SDDMo -matrix A s 
such that 

(a) A s 4 A 4 {5/4:) A s . 

(b) A s has at most dn non-zero off-diagonal entries. 

(c) The set of non-zero entries of A s is a subset of the set of non-zero entries of A. 
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(d) For all i, 

In a companion paper |ST08cj . we present a randomized algorithm Sparsif y2 that produces 
sparsifiers of Laplacian matrices in expected nearly-linear time. As explained in Section O this 
construction can trivially be extended to all SDDMo-matrices. 

Theorem 1.3 (Sparsification). On input an n x n Laplacian matrix A with 2m non-zero off- 
diagonal entries and a p > 0, Sparsify2 runs in expected time mlog(l/p) log 17 n and with 
probability at least 1 — p produces a c\ \og C2 {n/p)-sparsifier of A, for C2 = 30 and some absolute 
constant c\ > 1. 

We parameterize this theorem by the constants c\ and C2 as we believe that they can be 
substantially improved. In particular, Spielman and Srivastava |SS08j construct sparsifiers with 
C2 = 1, but these constructions require the solution of linear equations in Laplacian matrices, 
and so can not be used to help speed up the algorithms in this paper. Batson, Spielman and 
Srivastava [BSS09] have proved that there exist sparsifiers that satisfy conditions (a) through 
(c) of Definition 11.21 with ci = 0. 

2 Related Work 

In this section, we explain how our results relate to other rigorous asymptotic analyses of algo- 
rithms for solving systems of linear equations. For the most part, we restrict our attention to 
algorithms that make structural assumptions about their input matrices, rather than assump- 
tions about the origins of those matrices. 

Throughout our discussion, we consider an n-by-n matrix with m non-zero entries. When 
m is large relative to n and the matrix is arbitrary, the fastest algorithms for solving linear 
equations are those based on fast matrix multiplication |CW82j , which take time approximately 
0(n 2 ' 376 ). The fastest algorithm for solving general sparse positive semi-definite linear systems 
is the Conjugate Gradient. Used as a direct solver, it runs in time 0{mn) (see [TB971 Theo- 
rem 28.3]). Of course, this algorithm can be used to solve a system in an arbitrary matrix A in 
a similar amount of time by first multiplying both sides by A T . To the best of our knowledge, 
every faster algorithm requires additional properties of the input matrix. 

2.1 Special non-zero structure 

In the design and analysis of direct solvers, it is standard to represent the non-zero structure 
of a matrix A by an unweighted graph Ga that has an edge between vertices i ^ j if and only 
if Aij is non-zero (see |DER86j ). If this graph has special structure, there may be elimination 
orderings that accelerate direct solvers. If A is tri-diagonal, in which case Ga is a path, then a 
linear system in A can be solved in time 0(n). Similarly, when Ga is a tree a linear system in 
A by be solved in time 0(n) (see |DER86j ). 

If the graph of non-zero entries Ga is planar, one can use Generalized Nested Dissec- 
tion [Geo731 ILRT791 IGT87] to find an elimination ordering under which Cholesky factorization 
can be performed in time 0(ra L5 ) and produces factors with at most 0(n log n) non-zero entries. 
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We will exploit these results in our algorithms for solving planar linear systems in Section SJ 
We recall that a planar graph on n vertices has at most 3n — 6 edges (see |Har721 Corollary 11.1 
(c)]), so m < 6n. 

2.2 Subgraph Preconditioners 

Our work builds on a remarkable approach to solving linear systems in Laplacian matrices 
introduced by Vaidya |Vai90| . Vaidya demonstrated that a good preconditioner for a Laplacian 
matrix A can be found in the Laplacian matrix B of a subgraph of the graph corresponding to 
A. He then showed that one could bound the condition number of the preconditioned system by 
bounding the dilation and congestion of an embedding of the graph of A into the graph of B. By 
using preconditioners obtained by adding edges to maximum spanning trees, Vaidya developed 
an algorithm that finds e-approximate solutions to linear systems in SDDMo-matrices with at 
most d non-zero entries per row in time O^dn) 1 ' 75 log(l/e)). When the graph corresponding 
to A had special structure, such as having a bounded genus or avoiding certain minors, he 
obtained even faster algorithms. For example, his algorithm for solving planar systems runs in 
time 0((d7i) L2 log(l/e)). 

As Vaidya's paper was never published and his manuscript lacked many proofs, the task 
of formally working out his results fell to others. Much of its content appears in the thesis 
of his student, Anil Joshi |Jos97j . and a complete exposition along with many extensions was 
presented by Bern et. al. |BGH+06| . Gremban, Miller and Zagha [Gre96j IGMZ95] expl ain parts 
of Vaidya's paper as well as extend Vaidya's techniques. Among other results, they find ways of 
constructing preconditioners by adding vertices to the graphs. Maggs et. al. [MMP + 05] prove 
that this technique may be used to construct excellent preconditioners, but it is still not clear if 
they can be constructed efficiently. 

The machinery needed to apply Vaidya's techniques directly to matrices with positive off- 
diagonal elements is developed in [BCHT04] . An algebraic extension of Vaidya's techniques for 
bounding the condition number was presented by Boman and Hendrickson [BH03b] , and later 
used by them [BHOlj to prove that the low-stretch spanning trees constructed by Alon, Karp, 
Peleg, and West [AKPW95] , yield preconditioners for which the preconditioned system has con- 
dition number at most m 2°( v/lognloglogn ) . They thereby obtained a solver for symmetric diago- 
nally dominant linear systems that produces e-approximate solutions in time m l ^+ i l ) log(l/e). 
Through improvements in the construction of low-stretch spanning trees [EEST08, ABN08] and 
a careful analysis of the eigenvalue distribution of the preconditioned system, Spielman and 
Woo [SW09| show that when the Preconditioned Conjugate Gradient is applied with the best 
low-stretch spanning tree preconditioners, the resulting linear system solver takes time at most 
0(mn 1 ^ 3 log 1 / 2 n log(l/e)). The preconditioners in the present paper are formed by adding edges 
to these low-stretch spanning trees. 

The recursive application of subgraph preconditioners was pioneered in the work of Joshi |Jos97] 
and Reif [Rei98] . Reif [Rei98] showed how to recursively apply Vaidya's preconditioners to solve 
linear systems in SDDMo-matrices with planar non-zero structure and at most a constant num- 
ber of non-zeros per row in time 0(n 1+ ^ \og°^ (k(A) / e)) , for every f3 > 0. While Joshi's anal- 
ysis is numerically much cleaner, he only analyzes preconditioners for simple model problems. 
Our recursive scheme uses ideas from both these works, with some simplification. Koutis and 
Miller |KM07j have developed recursive algorithms that solve linear systems in SDDMo-matrices 
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with planar non-zero structure in time 0(n log(l/e)). 
2.3 Other families of matrices 

Subgraph preconditioners have been used to solve systems of linear equations from a few other 
families. 

Daitch and Spielman |DS08| have shown how to reduce the problem of solving linear equa- 
tions in symmetric Mo-matrices to the problem of solving linear equations in SDDMo-matrices, 
given a factorization of the Mo-matrix of width 2 |EGB05| . These matrices, with the required 
factorizations, arise in the solution of the generalized maximum flow problem by interior point 
algorithms. 

Shklarski and Toledo |ST08aj introduce an extension of support graph preconditioners, called 
fretsaw preconditioners, which are well suited to preconditioning finite element matrices. Daitch 
and Spielman |DS07j use these preconditioners to solve linear equations in the stiffness matrices 
of two-dimensional truss structures in time 0(n 5 / 4 log n log (1/e)). 

For linear equations that arise when solving elliptic partial differential equations, other tech- 
niques supply fast algorithms. For example, Multigrid methods may be proved correct when 
applied to the solution of some of these linear systems [BHMOl] . and Hierarchical Matrices run 
in nearly- linear time when the discretization is nice |BH03aj . Boman, Hendrickson, and Vavasis 
|BHV04j have shown that the problem of solving a large class of these linear systems may be 
reduced to that of solving diagonally-dominant systems. Thus, our algorithms may be applied 
to the solution of these systems. 

3 Background and Notation 

By logx, we mean the logarithm of x base 2, and by lnx the natural logarithm. 

We define SDDo to be the class of symmetric, weakly diagonally dominant matrices, and 
SDDMo to be the class of SDDo-matrices with non-positive off-diagonal entries. We define a 
Laplacian matrix to be a SDDMo-matrix with with zero row-sums. 

Throughout this paper, we define the A- norm by 

II^H^ = V x T Ax. 

3.1 Preconditioners 

For symmetric matrices A and B, we write 

A 4 B 

if B — A is positive semi-definite. We recall that if A is positive semi-definite and B is symmetric, 
then all eigenvalues of AB are real. For a matrix B, we let B^ denote the Moore-Penrose pseudo- 
inverse of B — that is the matrix with the same nullspace as B that acts as the inverse of B on 
its image. We will use the following propositions, whose proofs are elementary. 

Proposition 3.1. If A and B are positive semi-definite matrices such that for some a, j3 > 0, 

aA 4 B 4 j3 A 

then A and B have the same nullspace. 
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Proposition 3.2. If A and B are positive semi-definite matrices having the same nullspace and 
a > 0, then 

aA =4 B 

if and only if 

aB ] 4 A f . 

The following proposition notes the equivalence of two notions of preconditioning. This 
proposition is called the "Support Lemma" in [BGH + 06] and |Gre96j . and is implied by Theo- 
rem 10.1 of |Axe85| . We include a proof for completeness. 

Proposition 3.3. If A and B are symmetric matrices with the same nullspace and A is positive 
semi-definite, then all eigenvalues of AB^ lie between X m in and X max if and only if 

Xmin-B A =^ X max B. 

Proof. We first note that AB^ has the same eigenvalues as A 1 / 2 B^ A 1 / 2 . If for all x £ Image(A) 
we have 

X min x T x < x T A 1 l 2 B^A l l 2 x, 
then by setting z = A x l 2 x, we find that for all z G Image(A), 

which is equivalent to X m i n A^ =4 B^ and 

XminB =^ A. 

The other side is proved similarly. □ 

Following Bern et. al. jBGH + 06j . we define the finite generalized condition number Kf(A, B) 
of matrices A and B having the same nullspace to be the ratio of the largest to smallest non- 
zero eigenvalues AB^ . Proposition 13.31 tells us that \ m i n B =^ A ^ X max B implies kj(A,B) < 
Xmax I Xmin- One can use Kf(A, B) to bound the number of iterations taken by the Preconditioned 
Conjugate Gradient algorithm to solve linear systems in A when using B as a preconditioner. 
Given bounds on X max and X m i n , one can similarly bound the complexity of the Preconditioned 
Chebyshev method. 

3.2 Laplacians Suffice 

When constructing preconditioners, we will focus our attention on the problem of preconditioning 
Laplacian matrices. 

Bern et. al. |BGH + 06l Lemma 2.5], observe that the problem of preconditioning SDDMo- 
matrices is easily reduced to that of preconditioning Laplacian matrices. We recall the reduction 
for completeness. 

Proposition 3.4. Let A be a SDDMo-matrix. Then, A can be expressed as A = Al + Ad where 
Al is a Laplacian matrix and Ad is a diagonal matrix with non-negative entries. Moreover, 
if Bl is a Laplacian matrix such that Al =<I Bl, then A =^ Bl + Ad. Similarly, if Bl is a 
Laplacian matrix such that Bl =4 Al, then Bl + Ad =<; A. 
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So, any algorithm for constructing sparsifiers or ultra-sparsifiers for Laplacian matrices can 
immediately be converted into an algorithm for constructing sparsifiers or ultra-sparsifiers of 
SDDMo-matrices. Accordingly in Sections [U] and [TU] we will restrict our attention to the problem 
of preconditioning Laplacian matrices. 

Recall that a symmetric matrix A is reducible if there is a permutation matrix P for which 
P T AP is a block-diagonal matrix with at least two blocks. If such a permutation exists, one 
can find it in linear time. A matrix that is not reducible is said to be irreducible. The problem 
of solving a linear system in a reducible matrix can be reduced to the problems of solving linear 
systems in each of the blocks. Throughout the rest of this paper, we will restrict our attention 
to solving linear systems in irreducible matrices. It is well-known that a symmetric matrix is 
irreducible if and only if its corresponding graph of non-zero entries is connected. We use this 
fact in the special case of Laplacian matrices, observing that the weighted graph associated with 
a Laplacian matrix A has the same set of edges as Ga- 

Proposition 3.5. A Laplacian matrix is irreducible if and only if its corresponding weighted 
graph is connected. 

It is also well-known that the null-space of the Laplacian matrix of a connected graph is the 
span of the all-l's vector. Combining this fact with Proposition 13.41 one can show that the only 
singular irreducible SDDMo-matrices are the Laplacian matrices. 

Proposition 3.6. A singular irreducible SDDMo -matrix is a Laplacian matrix, and its nullspace 
is spanned by the all-l's vector. 

To the extent possible, we will describe our algorithms for solving irreducible singular and 
non-singular systems similarly. The one tool that we use for which this requires some thought is 
the Cholesky factorization. As the Cholesky factorization of a Laplacian matrix is degenerate, it 
is not immediately clear that one can use backwards and forwards substitutions on the Cholesky 
factors to solve a system in a Laplacian. To handle this technicality, we note that an irreducible 
Laplacian matrix A has a factorization of the form 

A = LDL T , 

where L is lower-triangular and non-zero on its entire diagonal and D is a diagonal matrix with 
ones on each diagonal entry, excluding the bottom right-most which is a zero. This factorization 
may be computed by a slight modification of standard Cholesky factorization algorithms. The 
pseudo-inverse of A can be written 

A ] = UL^DL^U, 

where II is the projection orthogonal to the all-l's vector (see Appendix |D|) . 

When A is a Laplacian and we refer to forwards or backwards substitution on its Cholesky 
factors, we will mean multiplying by DL~ l Ii or UL~ T D, respectively, and remark that these 
operations can be performed in time proportional to the number of non-zero entries in L. 

4 Solvers 

We first note that by Gremban's reduction, the problem of solving an equation of the form 
Ax = b for a SDDg-matrix A can be reduced to the problem of solving a system that is twice 
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as large in a SDDMo-matrix (see Appendix [A]). So, for the purposes of asymptotic complexity, 
we need only consider the problem of solving systems in SDDMo-matrices. 

To solve systems in an irreducible SDDMo-matrix A, we will compute an ultra-sparsifier 
B of A, and then solve the system in A using a preconditioned iterative method. At each 
iteration of this method, we will need to solve a system in B. We will solve a system in B 
by a two-step algorithm. We will first apply Cholesky factorization repeatedly to eliminate 
all rows and columns with at most one or two non-zero off-diagonal entries. As we stop the 
Cholesky factorization before it has factored the entire matrix, we call this process a partial 
Cholesky factorization. We then apply another solver on the remaining system. In this section, 
we analyze the use of a direct solver. In Section [5l we obtain our fastest algorithms by solving 
the remaining system recursively. 

The application of partial Cholesky factorization to eliminate rows and columns with at most 
2 non-zero off-diagonal entries results in a factorization of B of the form 

B = PLCL T P T , 

where C has the form 



C 



In—ri\ 



Ax, 

P is a permutation matrix, L is non-singular and lower triangular of the form 

V L 2,l inn 

and every row and column of A\ has at least 3 non-zero off-diagonal entries. 

We will exploit the properties of this factorization stated in the following proposition. 

Proposition 4.1 (Partial Cholesky Factorization). If B is an irreducible SDDMo-maira; then, 

(a) A\ is an irreducible SDDMo-matrix and is singular if and only if A is singular. 

(b) If the graph of non-zero entries of B is planar, then the graph of non-zero entries of A\ is 
as well. 

(c) L has at most 3n non-zero entries. 

(d) If B has 2(n — 1 + j) non-zero off-diagonal entries, then A\ has dimension at most 2j — 2 
and has at most 2(3j — 3) non-zero off-diagonal entries. 

(e) B^ = HP~ T L~ T f ^ n_n i ^ t ^ L~ 1 P~ 1 H, where II is the projection onto the span of B. 

Proof. It is routine to verify that A\ is diagonally dominant with non-positive off-diagonal 
entries, and that planarity is preserved by elimination of rows and columns with 2 or 3 non-zero 
entries, as these correspond to vertices of degree 1 or 2 in the graph of non-zero entries. It is 
similarly routine to observe that these eliminations preserve irreducibility and singularity. 

To bound the number of entries in L, we note that for each row and column with 1 non-zero 
off-diagonal entry that is eliminated, the corresponding column in L has 2 non-zero entries, 



9 



and that for each row and column with 2 non-zero off-diagonal entries that is eliminated, the 
corresponding column in L has 3 non-zero entries. 

To bound n\, the dimension of A\, first observe that the elimination of a row and column 
with 1 or 2 non-zero off-diagonal entries decreases both the dimension by 1 and the number of 
non-zero entries by 2. So, A\ will have 2(m — 1 + j) non-zero off-diagonal entries. As each row 
in A\ has at least 3 non-zero off-diagonal entries, we have 

2(ni - 1+j) > 3m, 

which implies n\ < 2j — 2. The bound on the number non-zero off-diagonal entries in A\ follows 
immediately. 

Finally, ([5]) may be proved by verifying that the formula given for satisfies all the axioms 
of the pseudo-inverse (which we do in Appendix [D]) . □ 

We name the algorithm that performs this factorization PartialChol, and invoke it with 
the syntax 

(P,L,^i) = PartialChol^). 
We remark that PartialChol can be implemented to run in linear time. 



4.1 One-Level Algorithms 

Before analyzing the algorithm in which we solve systems in A\ recursively, we pause to examine 
the complexity of an algorithm that applies a direct solver to systems in A±. While the results 
in this subsection are not necessary for the main claims of our paper, we hope they will provide 
intuition. 

If we are willing to ignore numerical issues, we may apply the conjugate gradient algorithm 
to directly solve systems in A\ in 0(n\m\) operations |TB971 Theorem 28.3], where m\ is the 
number of non-zero entries in A\. In the following theorem, we examine the performance of the 
resulting algorithm. 

Theorem 4.2 (General One-Level Algorithm). Let A be an irreducible n-by-n SDDMo-matrix 
with 2m non-zero off-diagonal entries. Let B be a yjm-ultra-sparsifier of A. Let (P,L,A\) = 
PartialChol(i^). Consider the algorithm that solves systems in A by applying PCG with B as a 
preconditioner, and solves each system in B by a performing backward substitution on its partial 
Cholesky factor, solving the inner system in A\ by conjugate gradient used as an exact solver, 
and performing forward substitution on its partial Cholesky factor. Then for every right-hand 
side b, after 

0(m 1 / 4 log(l/e)) 

iterations, comprising 

0(m^ 4 log 2c4 nlog(l/e)) 
arithmetic operations, the algorithm will output an approximate solution x satisfying 



x 



A ] b 



< e 



A ] b 



(2) 



10 



Proof. As Kf(A,B) < y/m, we may apply the standard analysis of PCG [Axe85j . to show that 
([2|) will be satisfied after 0(m 1//4 log (1/e)) iterations. To bound the number of operations in 
each iteration, note that B has at most 2(n — 1) + 0{^/rn\og Ci n) non-zero off-diagonal entries. 
So, Proposition 14. II implies ni\ and n\ are both 0(y/m \og Ci n). Thus, the time required to solve 
each inner system in A\ is at most 0{m\ni) = 0(m log 2c4 n). As A is irreducible m > n — 1, so 
this bounds the number of operations that must be performed in each iteration. □ 

If m is much greater than n, we could speed up this algorithm by first applying Sparsif y2 
to compute a very good sparse preconditioner A s for A, using the one-level algorithm to solve 
systems in A s , and then applying this solver to A by iterative refinement. 

When the graph of non-zero entries of A is planar, we may precondition using the the 
algorithm UltraSimple, presented in Section [UJ instead of UltraSparsif y. As the matrix 
Ai produced by applying partial Cholesky factorization to the output of UltraSimple is also 
planar, we can solve the linear systems in A± by the generalized nested dissection algorithm of 
Lipton, Rose and Tarjan |LRT79j . This algorithm uses graph separators to choose a good order 

3/2 

for Cholesky factorization. The Cholesky factorization is then computed in time 0{n^ ). The 
resulting Cholesky factors only have O(nilogni) non-zero entries, and so each linear system in 
A\ may be solved in time 0{n\ logni), after the Cholesky factors have been computed. 

Theorem 4.3 (Planar One-Level Algorithm). Let A be an n-by-n planar SDDMo-matrix with 
m non-zero entries. Consider the algorithm that solves linear systems in A by using PCG with 
the preconditioner 

B = UltraSimple^, n 3/4 log 1/3 n), 

and solves systems in B by applying PartialChol to factor B into P L[1 ,0;0, Ai]L T P T , and 
uses generalized nested dissection to solve systems in A\. For every right-hand side b, this 
algorithm computes an x satisfying 



- A^b < e A ] b 

A 



(3) 



in time 

O (n 9 / 8 log 1/2 nlog(l/e) 



Proof. First, recall that the planarity of A implies m < 3n. Thus, the time taken by UltraSimple 
is dominated by the time taken by LowStretch, which is 0(n log 2 n). 

By Theorem 19. II and Theorem[93J the matrix B has at most 2(n— l) + 6n 3 / 4 log 1 / 3 n non-zero 
off-diagonal entries and 



K f ( A, B) = O (n 1/4 log 2/3 n log 2 log nj < O (n 1/4 log \ 
Again, standard analysis of PCG [Axe 85] tells us that the algorithm will require at most 

O (n 1 / 8 log 1 / 2 nlog(l/e) N 
iterations guarantee that ([3|) is satisfied. 
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By Proposition 14.11 the dimension of A±, ni, is at most 

6 n 3/4 log i/3 n Before be ginning to 

solve the linear system, the algorithm will spend 

0(nl /2 ) = 0((n 3 / 4 log 1 / 3 nf' 2 ) = 0(n 9 / 8 log 1 / 2 n) 

time using generalized nested dissection |LRT79| to permute and Cholesky factor the matrix A±. 
As the factors obtained will have at most O(nilogni) < 0(n) non-zeros, each iteration of the 
PCG will require at most 0(n) steps. So, the total complexity of the application of the PCG 
will be 

O (n • (n 1 / 8 log 1 / 2 n log(l/e)) ) = O (n 9 / 8 log 1 / 2 n log(l/e)) , 

which dominates the time required to compute the Cholesky factors and the time of the call to 
UltraSimple. □ 



5 The Recursive Solver 

In our recursive algorithm for solving linear equations, we solve linear equations in a matrix A 
by computing an ultra-sparsifier B, using partial Cholesky factorization to reduce it to a matrix 
Ai, and then solving the system in A\ recursively. Of course, we compute all of the necessary 
ultra-sparsifiers and Cholesky factorizations just once at the beginning of the algorithm. 
To specify the recursive algorithm for an n-by-n matrix, we first set the parameters 

X = c 3 log C4 n, (4) 

and 

A;=(14 X + 1) 2 , (5) 

where we recall that C3 and C4 are determined by the quality of the ultra-sparsifiers we can 
compute (see equation ([T])), and were used to define a fc-ultra-sparsifier. 

We the following algorithm BuildPreconditioners to build the sequence of preconditioners 
and Cholesky factors. In Section \10\ we define the routine UltraSparsif y for weighted graphs, 
and thus implicitly for Laplacian matrices. For general irreducible SDDMo-matrices A, we 
express A as a sum of matrices A^ and Ajj as explained in Proposition 13.41 and return Ap plus 
the ultra-sparsifier of the Laplacian matrix A^. 
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BuildPreconditioners(Ao), 

1. i = 0. 

2. Repeat 

(a) i = i + 1. 

(b) Pj = UltraSparsify(Ai_i, /c). 

(c) (Pi, L is Aj) = partialChol(5i). 

(d) Set Ilj to be the projection onto the span of B^. 

Until Ai has dimension less than 66x + 6. 

3. Set£ = i. 

4. Compute Z£ = Ag} . 

We now make a few observations about the sequence of matrices this algorithm generates. 
In the following, we let noff (A) denote the number of non-zero off-diagonal entries in the upper- 
triangular portion of A, and let dim (A) denote the dimension of A. 

Proposition 5.1 (Recursive Preconditioning). If Aq is a symmetric, irreducible, SDDMo- 
matrix, and for each i the matrix Bi is a k -ultra- spar sifier of Ai, then 

(a) For i > I, noff (Ai) < (3x/fc)noff (A-i)- 

(b) Fori > 1, dim (A) < (2x/Jfe)naff (A_i). 

(c) For i > I, dim (Pj) = dim (A-l)- 

(d) Each of Bi and Ai is an irreducible SDDMq -matrix. 

(e) Each Ai and Bi is a Laplacian matrix if and only if Aq is as well. 

(/) If Aq is a Laplacian matrix, then each Ilj is a projection orthogonal to the all-1 's vector. 
Otherwise, each Ilj is the identity. 

Proof. Let rii be the dimension of A4. Definition 11.11 tells us that 

noff (Pj) < n - 1 + /moff (Ai) /k = n-l + noff (Ai) x/k. 

Parts (a), (b), (d) and (e) now follow from Proposition 14. 1[ Part (c) is obvious, and part (/) 
follows from Proposition 13.61 □ 

Our recursive solver will use each matrix Pj as a preconditioner for A^\. But rather than 
solve systems in Pj directly, it will reduce these to systems in Ai, which will in turn be solved 
recursively. Our solver will use the preconditioned Chebyshev method, instead of the precon- 
ditioned conjugate gradient. This choice is dictated by the requirements of our analysis rather 
than by common sense. Our preconditioned Chebyshev method will not take the preconditioner 
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Bi as input. Rather, it will take a subroutine solve^. that produces approximate solutions to 
systems in Bi. So that we can guarantee that our solvers will be linear operators, we will fix 
the number of iterations that each will perform, as opposed to allowing them to terminate upon 
finding a sufficiently good solution. While this trick is necessary for our analysis, it may also be 
unnecessary in practice^- 

For concreteness, we present pseudocode for the variant of the preconditioned Chebyshev 
algorithm that we will use. It is a modification of the pseudocode presented in [BBC + 94"1 page 
36], the difference being that it takes as input a parameter t determining the number of iterations 
it executes (and some variable names have been changed). 



x = precondCheby(A, b, t, /(•), \ min , \ ma x) 

(0) Set x = 0. 

(1) r = b 

(2) d — {^max ^mm)/2, C — (^max ^rain)/2 

(3) for % = 1, . . . ,t, 

(a) z = f(r) 

(b) if i = 1, 

x = z 
a = 2/d 

else, 

= (ca/2) 2 
a = l/(d-P) 
x = z + fix 

(c) X = X + OLX 

(d) r = b - Ax 



Proposition 5.2 (Linear Chebyshev). Let A be a positive semi-definite matrix and f be a 
positive semi-definite, symmetric linear operator such that 

^minf^ ^4 A =^ y^maxf^ • (6) 

Let e < 1 and let 

(7) 

Then, the function precondCheby( J 4, b, t, f, \ m in, ^max) is a symmetric linear operator in b with 
the same nullspace as A. Moreover, if Z is the matrix realizing this operator, then 

One could obtain a slightly weaker analysis of this algorithm if one instead allowed the Chebyshev solvers to 
terminate as soon as they found a sufficiently accurate solution. In an early version of this paper, we analyzed such 
an algorithm using the analysis of the inexact preconditioned Chebyshev iteration by Golub and Overton |GQ88| . 
This analysis was improved by applying a slight extension by Joshi |Jos97| of Golub and Overton's analysis. The 
idea of bypassing these analysis by forcing our solvers to be linear operators was suggested to us by Vladimir 
Rokhlin. 



t> I./^£i n . 2 



2 V ^rn.in € 
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Proof. By Proposition 13.11 condition ([6]) implies that / and A have the same nullspace. An 
inspection of the pseudo-code reveals that the function computed by precondCheby can be 
expressed as a sum of monomials of the form f(Af) 1 , from which it follows that this function is 
a symmetric linear operator having the same nullspace as A. Let Z be the matrix realizing this 
operator. 

Standard analyses of the preconditioned Chebyshev algorithm |Axe85l Section 5.3] imply 
that for all b in the range of A, 



Zb-A ] b 



< e 



Now, consider any non-zero eigenvalue A and eigenvector b of AZ, so that 

AZb = Xb. 

Multiplying on the left by A^ and using the fact that Z and A have the same nullspace, we 
obtain 

Zb = XA^b. 



Plugging this into the previous inequality, we find 



A ] b 



> 



IA- II 



A ] b 



Zb-A ] b 

and so A must lie between 1 — e and 1 + e. Applying Proposition 13.31 we obtain 

(l-e)zt 4A^ (l + e)tf. 

We can now state the subroutine solve^ for i = !,...,£. 



□ 



x = solveBi(b) 






1. Set X m i n = 1 — 2e -2 , X max = (1 + 2e~ 2 )k and t = 


1.33V* 




2. Set s = LT x PT x Uib. 






3. Write s = ( S ° | , where the dimension of si is the size of As. 


4. Set y Q = so, and 






(a) if i = £, set y 1 = Zi&\ 






(b) else, set y 1 = precondCheby( J 4 i , si, solve Bi+1 , t, X min , 


Xmax ) • 


5. Setx = n i P i r T Lr T (^ jjo ^ 







We have chosen the parameters A m j n , X ma x, arid t so that inequality (|7|) holds for for e = 2e 2 . 
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We note that we apply L i and L i by forward and backward substitution, rather than by 
constructing the inverses. Similarly IT may be applied in time proportional to the length of 6 as 
it is either the identity, or the operator that orthogonalizes with respect to the all-l's vector. We 
remark that the multiplications by IT are actually unnecessary in our code, as solve^ will only 
appear inside a call to precondCheby, in which case it is multiplied on either side by matrices 
that implicitly contain Ilj. However, our analysis is simpler if we include these applications of 

n*. 

Lemma 5.3 (Correctness of solve^). If A is an irreducible SDDMo-matrix and Bi =<! Ai^i =^ 
kB{ for all i > 1, then for 1 < i < £, 

(a) The function solves,, is a symmetric linear operator. 

(b) The function precondCheby(Aj_i, b, solve^, t, A m j n , \ m ax) is a symmetric linear operator 
in b. 

(c) 

(1 - 2e~ 2 )Zj ^ Ai ^ (1 + 2e" 2 )Z 4 t, 
where for i < I — 1, Zi is the matrix such that 



ZiSi = precondCheby(yl i , si, solve^ +1 ,t, X min 

(d) 

(1 - 2e^ 2 )solve Bi t ^ B, ^ (1 + 2e^ 2 )solve i?i t . 

Proof. We first prove (a) and (6) by reverse induction on i. The base case of our induction is 
when i = I, in which case BuildPreconditioners sets Zi = Ag* , and so 



solves = U e P e ~ T Lf ( J ° J LfP^U,, 



which is obviously a symmetric linear operator. Given that solves, is a symmetric linear 
operator, part (b) for ^4j_i follows from Proposition 15.21 Given that (6) holds for Ai and that 
the call to precondCheby is realized by a symmetric matrix Zi, we then have that 

solve- = UiPr T L7 T ( [ I ) L^pr^ 



Zi 

is a symmetric linear operator. We may thereby establish that (a) and (b) hold for all 1 > i > I. 

We now prove properties (c) and (d), again by reverse induction. By construction Zi = Ag} , 
so (c) holds for i = I. To see that if (c) holds for i, then (d) does also, note that 

(1 - 2e" 2 )Zj t Ai implies 
(1 — 2e~ 2 )A^ =<! Zi, by Proposition 13.21 which implies 

(1 " 2e ~ 2) (o %) 4 (o %) which implies 
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(1 - 2e~ 2 )Bj = (i _ 2e~ 2 )Il l p- T L- T ( J ° f ^ L^i^Hi (by Proposition O (e)) 

* n*JT T £- T ( o Zi ) L * rlj ^ ln * = solveB - 

which by Proposition l3.2l implies (1— 2e~ 2 )solveR.t =^ Bi. The inequality Bi (l+2e _2 )solves^ 
may be established similarly. 

To show that when (d) holds for i then (c) holds for i — note that (d) and Bi =4 A-i =^ k-Bi 
imply 

(1 - 2e~ 2 )solve Bi t A-i 4 k(l + 2e~ 2 )solve Bi t . 

So, (c) for i — 1 now follows from Proposition 15.21 and the fact that A m j n , X max and t have been 
chosen so that inequality ([7]) is satisfied with e = 2e -2 . □ 

Lemma 5.4 (Complexity of solve^J. If Aq is an irreducible, n-by-n SDDMo-matrix with 2m 
non-zero off-diagonal entries and each Bi is a k -ultra- spar sifier of A— 1> then solve^ runs in 
time 

0(n + m). 

Proof. Let Tj denote the running time of solve^. We will prove by reverse induction on i that 
there exists a constant c such that 

Ti < c (dim (Bi) + ( 7X + <5)(noff (A) + dim (A*))) , (8) 

where 

7 = 196 and 5 = 15. 
This will prove the lemma as dim (Si) = dim(Ao) = n, and Proposition 15.11 implies 

( 7X + <5)(noff (A) + dim (A)) < ( 7 X + <5)^ < = 0(m). 

To prove (JSj) , we note that there exists a constant c so that steps [2] and [5] take time at most 
c(dim(5j)) (by Proposition 14. ip . step Ha] takes time at most c(dim(A) 2 )> and step I4bl takes 
time at most t(c ■ dim (A) + c • noff (A) + Ji+i), where i is as denned on stepCQof solve^. 

The base case of our induction will be i = I, in which case the preceding analysis implies 

T e <c (dim (B e ) + dim (A) 2 

< c (dim (i?^) + (66% + 6)dim (A^)) , (by step [2] of BuildPreconditioners) 

which satisfies (|8j). We now prove ([8]) is true for i < £, assuming it is true for i + 1. We have 

T,<c (dim (Bi)) + t(c • dim (A) + c • noff (A) + T i+1 ) 

< c [dim (Bi) + t(dim (A) + noff (A) + dim (B i+1 ) + ( 7X + 5) (noff (A+i) + dim (A+i)))] 

(by the induction hypothesis) 

< c [dim (B^ + t(2 dim (A) + noff (A) + ( 7 X + S)(5 noff (A) x/k))] 
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(by Proposition 15. ip 

< c [dim (Bi) + t (2 dim (Ai) + 6 naff (A*))] , 
as 7X 2 + o~X < As 

6< < 6 • (1.33(14x + 1) + 1) < IX + 5, 
we have proved that ([8]) is true for i as well. 

We now state and analyze our ultimate solver. 



□ 



x = solve(A, b, e) 

1. Set Xmi n = 1 - 2e~ 2 , X max = (1 + 2e~ 2 )k and i 

2. Run BuildPreconditioners(A). 

3. x = precondCheby(^, b, solve Bl , t, \ m in, ^max) 



0.67v / fcln(2/e) 



Theorem 5.5 (Nearly Linear-Time Solver). On input an irreducible n-by-n SDDMo-maira; A 
with 2m non-zero off-diagonal entries and an n-vector b, with probability at least 1 — 1/500, 
solve( J 4, 6,e) runs in time 



0(mlog C4 mlog(l/e)) +mlog° (1) 



m 



and produces an x satisfying 



x 



A^b 



< e 



A^b 



Proof. By Proposition l5.ll the numbers noff (Ai) are geometrically decreasing, and I < log k ^ x m. 
So we may use Theorem 110.51 to show that the time required to build the preconditioners is at 
most mlog ^ m. If each Bi is a /c-ultra-sparsifier of then the bound on the A-norm of 

the output follows by an analysis similar to that used to prove Lemma 15.31 In this case, we may 
use Lemma 15.41 to bound on the running time of step 3 by 

0(mt) = 0(mVk log(l/e)) = O (m log C4 n log(l/e)) . 

The probability that there is some Bi that is not a A;-ultra-sparsifier of ^4j_i is at most 

1 I 



E 



< 



: , /3x . < 1/500, 



2 dim (Bi) ~ 2(66^ + 6) ~ 2(66x + 6) 



assuming 03,04 > 1. 



□ 



If the non-zero structure of A is planar, then by Theorem 19.51 we can replace all the calls 
to UltraSparsif y in the above algorithm with calls to UltraSimple. By Theorem 19. 1} this 
is like having (k, /i)-ultra-sparsifiers with h = 0(lognlog 2 log re). Thus, the same analysis goes 
through with x = O (log n log 2 log n), and the resulting linear system solver runs in time 



0(n log 2 n + nlogra log 2 logn log(l/e)). 
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We remark that our analysis is very loose when m is much larger than n. In this case, the 
first ultra-sparsifier constructed, B±, will probably have close to n edges, which could be much 
lower than the bound proved in Proposition 15.11 While it is not necessary for the proof of our 
theorem, one could remove this slack by setting B\ = Spars if y(Ao, l/2n) in this case. 

6 Computing Approximate Fiedler Vectors 

Fiedler |Fie73j was the first to recognize that the eigenvector associated with the second-smallest 
eigenvalue of the Laplacian matrix of a graph could be used to partition a graph. From a result 
of Mihail [Mih89], we know that any vector whose Rayleigh quotient is close to this eigenvalue 
can also be used to find a good partition. We call such a vector an approximate Fiedler vector. 

Definition 6.1 (Approximate Fiedler Vector). For a Laplacian matrix A, v is an e-approximate 
Fiedler vector if v is orthogonal to the all-1 's vector and 

v T Av , , . . 
v 1 v 

where A2(^4) is the second-smallest eigenvalue of A. 

Our linear system solvers may be used to quickly compute e-approximate Fiedler vectors. 
We will prove that the following algorithm does so with probability at least 1 — p. 



v = ApproxFiedler^, e,p) 

0.67V3feln(8/e) . 

2. Set k = 81n(18(n- l)/e)/e. 

3. For a = 1, ... , flog 2 p]. 

(a) Run BuildPreconditioners(A). 

(b) Choose r° to be a random unit vector orthogonal to the all-l's vector. 

(c) For b = l,...,k 
r b = precondCheby(^,r b ~ 1 ,solve jBl ,t, A min , A max ). 

(d) Set v a = r k . 

4. Let ao be the index of the vector minimizing v^Av^J v^ v aQ . 

5. Set v = v an . 



Theorem 6.2. On input a Laplacian matrix A with m non-zero entries and e,p > 0, with 
probability at least 1 — 1/p, ApproxFiedler(A, e,p) computes an e-approximate Fiedler vector of 
A in time 

m log '- 1 '' m log(l/p) log(l/ e) / e. 



1. Set Xmin = 1 — 2e 2 , A max = (1 + 2e 2 )k and t = 
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Our proof of Theorem 16.21 will use the following proposition. 

Proposition 6.3. If Z is a matrix such that 

(l-e)Zt ^ A 4 {l + e)Z\ 

and v is a vector such that v T Z^v < (\ + e)\2{Z^), for some e < 1/5, then v is a 4e- approximate 
Fiedler vector of A. 

Proof. We first observe that 

Aa(Zt) < X 2 (A)/(1 - e). 

We then compute 

v T Av < (l + e)v T Z j v 

< (l + e)(l + e)A 2 (Z t ) 

<(l + e)(l + e)A 3 (A)/(l-e) 

< (l + 4e)A 2 (A), 

for e< 1/5. □ 

Proof of Theorem \6.S\ As we did in the proof of Lemma [5 .31 and Theorem 15 .5} we can show that 
precondCheby(A, b, solve b±, t, \ m i n , \ max ) is a linear operator in b. Let Z denote the matrix 
realizing this operator. As in the proof of Lemma 15.31 we can show that (1 — e/4)Z^ =<; A =^ 
(l + e/4)Zt. 

By Proposition 16.31 it suffices to show that with probability at least 1/2 each vector v a 
satisfies 

v T a tfv a /v T a v a <{l + e/A)\ 2 (tf). 

To this end, let = /Ui < fi 2 < • • • < be the eigenvalues of Z\ and let 1 = u\, . . . , u n be 
corresponding eigenvectors. Let 

i>2 

and recall that (see e.g. [SST061 Lemma B.l]) 



\a 2 \ > 2/3y/(n-l) > —= / e"' 2/2 dt > 0.504. 

-I V27TJ2/3 



Pr 

Thus, with probably at least 1/2, the call to BuildPreconditioners succeeds and \a 2 \ > 
2/3-v/ (n — 1). In this case, 

fc > 81n(8/a|e)/e. (9) 
We now show that this inequality implies that r k satisfies 

(r k ) T ZW k , 
\J k) T rk <(l + */*W 

To see this, let j be the greatest index such that fij < (1 + e/8)fi 2 , and compute 

r k = Z k r° = J2<Xi/rfu>i, 

i>2 
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so 

(r k ) T Z^r k Ei>2^/V 



/,jk-i 



^ ^ + T273fc 



< (1+6/8)^2+^2 

< (1 + e/8)^ 2 + M2 



'X^ov/iO 2 *- 1 ' 



OS 



Ei^^ci/a+c/s))*- 1 " 



< (1 + e/8)/i 2 + M2 J] a ? e / 8 inequality ©) 

< (l + e/8)^ 2 + ^ 2 (e/8) 

< (l + e/4)M2. 



□ 



7 Laplacians and Weighted Graphs 

We will find it convenient to describe and analyze our preconditioners for Laplacian matrices 
in terms of weighted graphs. This is possible because of the isomorphism between Laplacian 
matrices and weighted graphs. To an n-by-ra Laplacian matrix A, we associate the graph with 
vertex set {1, . . . , n} having an edge between vertices u and v of weight — A(u, v) for each it and 
v such that A(u, v) is non-zero. 

All the graphs we consider in this paper will be weighed. If u and v are distinct vertices in 
a graph, we write (it, v) to denote an edge between it and v of weight 1. Similarly, if w > 0, 
then we write w(u, v) to denote an edge between it and v of weight w. A weighted graph is then 
a pair G = (V,E) where V is a set of vertices and E is a set of weighted edges on V, each of 
which spans a distinct pair of vertices. The Laplacian matrix Lq of the graph G is the matrix 
such that 

— w if there is an edge w(u, v) G E 

Lg(u, v) = < if it ^ v and there is no edge between it and v in E 

Xw(u,x)eE w if« = «- 
We recall that for every vector x € H n , 

x T L G x= ^2 w(x u -x v f. 

w(u,v)aE 

For graphs G and H, we define the graph G + H to be the graph whose Laplacian matrix is 
Lg + Lh- 



21 



8 Graphic Inequalities, Resistance, and Low-Stretch Spanning 
Trees 



In this section, we introduce the machinery of "graphic inequalities" that underlies the proofs in 
the rest of the paper. We then introduce low-stretch spanning trees, and use graphic inequalities 
to bound how well a low-stretch spanning tree preconditions a graph. This proof provides the 
motivation for the construction in the next section. 
We begin by overloading the notation =<! by writing 

G =4 H and E 4 F 

if G = (V, E) and H = (V, F) are two graphs such that their Laplacian matrices, Lq and Lh 
satisfy 

Lg "4 Lh- 

Many facts that have been used in the chain of work related to this paper can be simply 
expressed with this notation. For example, the Splitting Lemma of [BGH + 06] becomes 

Ai Bi and A 2 4 B 2 implies Ai + A 2 4 B x + B 2 . 

We also observe that if B is a subgraph of A, then 

B 4 A. 

We define the resistance of an edge to be the reciprocal of its weight. Similarly, we define 
the resistance of a simple path to be the sum of the resistances of its edges. For example, 
the resistance of the path Wi(l, 2), w 2 (2, 3), 103(3, 4) is (l/wi + l/w 2 + Of course, the 

resistance of a trivial path with one vertex and no edges is zero. If one multiplies all the weights 
of the edges in a path by a, its resistance decreases by a factor of a. 

The next lemma says that a path of resistance r supports an edge of resistance r. This 
lemma may be derived from the Rank-One Support Lemma of [BH03b] , and appears in simpler 
form as the Congestion-Dilation Lemma of [BGH + 06] and Lemma 4.6 of [Gre96| . 

Lemma 8.1 (Path Inequality). Let e = w(u,v) and let P be a path from u to v. Then, 

e =4 w resistance(P) ■ P. 

Proof. After dividing both sides by w, it suffices to consider the case w = 1. Without loss of 
generality, we may assume that e = (1,/c + 1) and that P consists of the edges Wi(i,i + 1) for 
1 < % < k. In this notation, the lemma is equivalent to 

(1, k + 1) ^ M 1 - 2 ) + ^(2, 3) + • • • + w k (k, k + 1)). 

We prove this for the case k = 2. The general case follows by induction. 
Recall Cauchy's inequality, which says that for all < a < 1, 

{a + hf < a 2 /a + b 2 /(l-a). 
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For k = 2, the lemma is equivalent to 

(sci - x 3 ) 2 < (1 + wi/w 2 )(xi - x 2 ) 2 + (1 + w 2 /wi)(x 2 - x 3 ) 2 , 

which follows from Cauchy's inequality with a = W2/{w\ + w 2 )- □ 

Recall that a spanning tree of a weighted graph G = (V, E) is a connected subgraph of G 
with exactly \V\ — 1 edges. The weights of edges that appear in a spanning tree are assumed to 
be the same as in G. If T is a spanning tree of a graph G = (V, E), then for every pair of vertices 
u,v G V , T contains a unique path from u to t> . We let T(u, v) denote this path. We now use 
graphic inequalities to derive a bound on how well T preconditions G. This bound strengthens 
a bound of Boman and Hendrickson [BH03b, Lemma 4.9]. 



Lemma 8.2. Let G = (V, E) be a graph and let T be a spanning tree of G. Then, 

T^G4(y resistance[ne A-T. 

\ z— / resistance(e) ' 



Proof. As T is a subgraph of G, T =4 G is immediate. To prove the right-hand inequality, we 
compute 



e 



eS-E 

« Y resistance (^)) . T(e)) by LemmaO 

' resistance! e) 

resistance(T(e)) \ ^ T( ) -< T 

resist ance(e) y 



=5! 



□ 



Definition 8.3. Given a tree T spanning a set of vertices V and a weighted edge e = w(u, v) 
with u, v G V, we define the stretch of e with respect to T to be 

resistance^ [e]) . 

strle) = i-T — = w ■ resistanceil (en. 

resistance(e) 

If E is a set of edges on V, then we define 

st T (E) = y s t T {e). 

With this definition, the statement of Lemma 18.21 may be simplified to 

T 4 G 4 st T (E) ■ T. (10) 



We will often use the following related inequality, which follows immediately from Lemma [8. II 
and the definition of stretch. 

w(u,v) =4 stT(w(u,v)) T(u,v) = w stT((u,v)) T(u,v), (11) 

where we recall that T(u, v) is the unique path in T from u to v. 
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9 Preconditioning with Low-Stretch Trees 



In this section, we present a simple preconditioning algorithm, UltraSimple, that works by 
simply adding edges to low-stretch spanning trees. This algorithm is sufficient to obtain all our 
results for planar graphs. For arbitrary graphs, this algorithm might add too many additional 
edges. We will show in Section [TU] how these extra edges can be removed via sparsification. 

9.1 Low-Stretch Trees 

Low-stretch spanning trees were introduced by Alon, Karp, Peleg and West |AKPW 95] . At 
present, the construction of spanning trees with the lowest stretch is due to Abraham, Bartal 
and Neiman ABN08J, who prove 

Theorem 9.1 (Low Stretch Spanning Trees). There exists an 0(m log n + n log 2 n)-time algo- 
rithm, LowStretch, that on input a weighted connected graph G = (V,E), outputs a spanning 
tree T of G such that 

str(E) < cabn ralognloglogra(logloglogn) 3 , 
where m = \E\, for some constant cabn- In particular, stx{E) = 0(?nlogn log 2 logn). 

9.2 Augmenting Low-Stretch Spanning Trees 

To decide which edges to add to the tree, we first decompose the tree into a collection of 
subtrees so that no non-singleton subtree is attached to too many edges of E of high stretch. 
In the decomposition, we allow subtrees to overlap at a single vertex, or even consist of just a 
single vertex. Then, for every pair of subtrees connected by edges of E, we add one such edge 
of E to the tree. The subtrees are specified by the subset of the vertices that they span. 

Definition 9.2. Given a tree T that spans a set of vertices V , a T -decomposition is a decom- 
position of V into sets W\, . . . , Wh such that V = UWi, the graph induced by T on each Wi is a 
tree, possibly with just one vertex, and for alii 7^ j, \WiC\ Wj | < 1 . 

Given an additional set of edges E onV , a (T, E)- decomposition is a pair ({Wi, . . . , Wh} , p) 
where {Wi, . . . , Wh} is a T -decomposition and p is a map that sends each edge of E to a set or 
pair of sets in {W±, . . . , Wh} so that for each edge in (u, v) £ E, 

(a) if p(u,v) = {Wi} then {u, v} £ Wi, and 

(b) if p(u, v) = {Wi, Wj}, then either u G Wi and v G Wj, or u G Wj and v G Wi. 

We remark that as the sets Wi and Wj can overlap, it is possible that p(u,v) = {Wi,Wj}, 
u£Wi and v G Wi n Wj. 

We use the following tree decomposition theorem to show that one can always quickly find a 
T-decomposition of E with few components in which the sum of stretches of the edges attached 
to each component is not too big. As the theorem holds for any non-negative function rj on the 
edges, not just stretch, we state it in this general form. 



24 



Figure 2: An example of a tree decomposition. Note that sets W± and Wq overlap, and that set 
W$ is a singleton set and that it overlaps W4. 



Theorem 9.3 (decompose). There exists a linear-time algorithm, which we invoke with the 
syntax 

({Wi, . . . , W h } , p) = decompose(T, E, 77, t), 

that on input a set of edges E on a vertex set V , a spanning tree T onV , a function r\ : E — > IR + , 
and an integer 1 < t < X^eG-E v( e )> outputs a (T, E) -decomposition ({Wi, . . . , Wh} , p), such that 

(a) h<t, 

(b) for all Wi such that \Wi\ > 1, 

e£E:Wi£p{e) e&E 

For pseudo-code and a proof of this theorem, see Appendix O We remark that when t > n, 
the algorithm can just construct a singleton set for every vertex. 

For technical reasons, edges with stretch less than 1 can be inconvenient. So, we define 

77(e) = max(str(e), 1) and r)(E) = 77(e). (12) 

The tree T should always be clear from context. 

Given a (T, ^-decomposition, ({Wi, . . . , Wh} , p), we define the map 

a : {1, . . . , h} x {1, . . . , h} — > E U {undefined} 

by setting 

a U j\ - i ar S max e:p(e)={Wi,w 3 } weight {e) / r](e) , if i ^ j and such an e exists 
1 undefined otherwise. 

In the event of a tie, we let e be the lexicographically least edge maximizing weight (e) /77(e) such 
that p(e) = {Wi, Wj}. Note that cr(i,j) is a weighted edge. 

The map a tells us which edge from E between Wi and Wj to add to T. The following 
property of a, which follows immediately from its definition, will be used in our analysis in this 
and the next section. 



25 



Proposition 9.4. For every i,j such that o~(i,j) is defined and for every e S E such that 
p(e) = {W i ,W j }, 

weight(e) weight (a(i,j)) 
77(e) ~ r]{a{i,j)) 

We can now state the procedure by which we augment a spanning tree. 



F = AugmentTree(T, E, t), 

E is set of weighted edges, 

T is a spanning tree of the vertices underlying E, 
t is an integer. 

1. Compute stT(e) for each edge e G E. 

2. Set ((Wi, . . . , W/J , /o) = decompose(T, E 1 , 77, i), where 77(e) is as denned in ([12 



3. Set F to be the union of the weighted edges cr(i,j) over all pairs 1 < i < j < h for which 
cr(i,j) is defined, where cr(i,j) is as defined in (fT3"|) . 



^4 = UltraSimple(.E, rj) 

1. Set T = LowStretch(E'). 

2. Set F = Augment Tree(T, E, t). 

3. Set A = T U F. 



We remark that when t > n, UltraSimple can just return A = E. 

Theorem 9.5 (Augment Tree). On input a set of weighted edges E, a spanning subtree T, and 
an integer 1 < t < r)(E), the algorithm AugmentTree runs in time 0{m\ogn), where m = \E\. 
The set of edges F output by the algorithm satisfies 

(a) FCE, 

(b) \F\ < t 2 /2, 

(c) IfTQE, as happens when AugmentTree is called by UltraSimple, then (TU F) =^ E. 
(d) 

E^ 12r, ( E > . (r u F). (14) 
Moreover, if E is planar then A is planar and \F\ < 3t — 6. 

Proof. In Appendix [Bl we present an algorithm for computing the stretch of each edge of E in 
time 0(m log n). The remainder of the analysis of the running time is trivial. Part (a) follows 
immediately from the statement of the algorithm. When T C E, T U F C E, so part (c) follows 
as well. 
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To verify (b), note that the algorithm adds at most one edge to F for each pair of sets in 
Wi, . . . , Wh, and there are at most L) < t 2 /2 such pairs. If E is planar, then F must be planar 
as F is a subgraph of E. Moreover, we can use Lemma lC.ll to show that the graph induced by 
E on the sets Wi, . . . , Wh is also planar. Thus, the number of pairs of these sets connected by 
edges of E is at most the maximum number of edges in a planar graph with t vertices, 3t — 6. 

We now turn to the proof of part (d). Set 

= (15) 
By Theorem 19.31 P an d W\ , ■ ■ ■ , Wh satisfy 

v(e)<P, for all W such that |Wj| > 1. (16) 

e:Wj6p(e) 

Let Ef 1 * denote the set of edges e with p(e) = (Wi, Wi), and let Ef xt denote the set of edges 
e with \p(e)\ = 2 and W { G p(e). Let E int = UiEj nt and E ext = UiEf xt . Also, let T { denote 
the tree formed by the edges of T inside the set Wi. Note that when \Wi\ = 1, Tj and Ef 11 are 
empty. 

We will begin by proving that when \W%\ > 1, 

ETU I E ^ e )) T - ( 17 ) 

from which it follows that 

E im^ I E ^jr,. (18) 

i:|Wi|>l \e£El nt J 

For any edge e S -BJ" - *, the path in T between the endpoints of e lies entirely in Tj. So, by 
(fTTI) we have 

e st T (e) • Tj ^ 97(e) • Tj. 

Inequality fjlTj) now follows by summing over the edges e G £'^ n *. 
We now define the map r : E — > Tv U {undefined] by 

r (e) = if M e )l = 2 ' where = W' 1 ^}' and ( 19 ) 

1 undefined otherwise. 

To handle the edges bridging components, we prove that for each edge e with p(e) = (Wi, Wj), 

e 4 3r?(e)(Tj + 2}) + 3 ■ r(e) (20) 

weight(r(ej) 

Let e = f) be such an edge, with u £ Wi and u £ Wj. Let r(e) = ?/), with x £ Wi and 
y G Wj . Let t i denote the last vertex in Tj on the path in T from u to v (see Figure [3]) . If Tj is 
empty, tj = u. Note that tj is also the last vertex in Tj on the path in T from x to y. Define tj 
similarly. As T(n, x) C T(n, tj) U Tj(tj, x), the tree Tj contains a path from u to x of resistance 
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Figure 3: In this example, e = w(u,v) and r(e) = z(x, y). 



at most 

resistance(Tj(n, tj)) + resistance (T(ij, x)), 
and the tree Tj contains a path from y to v of resistance at most 

resistance(Tj (y, tj)) + resistance(Tj (t j ,v)). 

Furthermore, as T(u,U) +Tj(tj,v) C T(u,v) and Ti(ti,x) + Tj(y,tj) C T(x,y), the sum of the 
resistances of the paths from u to x in Tj and from y to v in Tj is at most 

resistance(T(u, v )) + resistance (T(x, y)) = stT(e)/u> + stT(r(e))/ z 

< r/(e)/w + r}{r{e))/z 

< 2r](e)/w, 

where the last inequality follows from Proposition 19.41 Thus, the graph 

37Ke)(T + Tj) + 3w(x,y) = 3 V (e)(T t + Tj) + 3 ^ft 6 ) . r(e ) 

weignt(r(e)j 

contains a path from u to v of resistance at most 

2 1 1 1 1 

3w 3w in' 

which by Lemma 18 . 1 1 implies (I20D . 

We will now sum (|2(J|) over every edge e G T? 2 ^ for every i, observing that this counts every 
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edge in E ext twice. 

E^ = (1/2)J2 E e 



i e£Ef 



= 3V f V r^U + S V Ty? ( ^ -r(e) 
^ I ^ ,w I ^ weight(r(e)) V ; 



i:\Wi\>l \e£Ef xt / e£E ext V V " 

as Tj is empty when \Wi\ = 1. 

We will now upper bound the right-hand side of (|21|) . To handle boundary cases, we divide 
_E ext into two sets. We let E^ gle consist of those e G £; ea; * for which both sets in p(e) have 
size 1. We let E e g xt neral = E ext — E ex ^ gle contain the rest of the edges in E ext . For e G E ex ^ gle , 
r(e) = e, while for e G E^ eral , r(e) G £^* e m*- 

For ^g/e- we have 

E 7 ^ • ^) = E ^)=^e- 

^ weightfr(e)) ^ 9 

single single 

To evaluate the sum over the edges e G E g x ^ eral , consider any / G E e g xt nerol in the image of 
r. Let i be such that / G Ef xt and |Wj| > 1. Then, for every e such that r(e) = /, we have 
e G Sf^. So, by Proposition 19.41 



E weight (e) weight (e) 



e£Eext weight(r(e)) weight(r(e)) 

T(e)=f r(e)=f 

weight (e) vfe 



e6 ^„ weight(r(e)) -l(r(e)) ^ 

Thus, 



E />■/</»■* 

eefiexi o \ \ tj /eimage(-r) 

f£E ext , 

J general 

Plugging this last inequality into (f2T|) . we obtain 



£ e ^3 E E ^)|r 4 + 3/3-F. 

i:|Wi|>l \e6.Ef* 
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Applying (fTBj) and then (fT6j) . we compute 



E = E ext + E int ^ 3 X) Ti Z) ^ e ) + Z ^(e) J +3/3-F ^ 3/3- (TUF), 



which by (|15p implies the lemma. □ 

We now observe three sources of slack in Theorem 19. 5\ in decreasing order of importance. 
The first is the motivation for the construction of ultra-sparsifiers in the next section. 

1. In the proof of Theorem 19.51 we assume in the worst case that the tree decomposition 
could result in each tree Tj being connected to t — 1 other trees, for a total of t(t — l)/2 
extra edges. Most of these edges seem barely necessary, as they could be included at a 
small fraction of their original weight. To see why, consider the crude estimate at the end 
of inequality (|22l) . We upper bound the multiplier of one bridge edge / from , 



E 



weight (e) 



weight (r(e)) ' 

e£E ext 
r(e)=f 

by the sum of the multipliers of all bridge edges from Tj. 

weight (e) 



E 



lB i„, weight(r(e)) ' 

The extent to which this upper bound is loose is the factor by which we could decrease 
the weight of the edge / in the preconditioner. 

While we cannot accelerate our algorithms by decreasing the weights with which we include 
edges, we are able to use sparsifiers to trade many low-weight edges for a few edges of 
higher weight. This is how we reduce the number of edges we add to the spanning tree to 
ilog° (1) n. 

2. The number of edges added equals the number of pairs of trees that are connected. While 
we can easily obtain an upper bound on this quantity when the graph has bounded genus, 
it seems that we should also be able to bound this quantity when the graph has some nice 
geometrical structure. 



3. The constant 12 in inequality (|14p can be closer to 2 in practice. To see why, first note 
that the internal and external edges count quite differently: the external edges have three 
times as much impact. However, most of the edges will probably be internal. In fact, if 
one uses the algorithm of [ABN08] to construct the tree, then one can incorporate the 
augmentation into this process to minimize the number of external edges. Another factor 
of 2 can be saved by observing that the decomposeTree, as stated, counts the internal 
edges twice, but could be modified to count them once. 
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10 Ultra- Sparsifiers 



We begin our construction of ultra-sparsifiers by building ultra-sparsifiers for the special case in 
which our graph has a distinguished vertex r and a low-stretch spanning tree T with the property 
that for every edge e E E — T, the path in T connecting the endpoints of e goes through r. 
In this case, we will call r the root of the tree. All of the complications of ultra-sparsification 
will be handled in this construction. The general construction will follow simply by using tree 
splitters to choose the roots and decompose the input graph. 

The algorithm RootedUltraSparsif y begins by computing the same set of edges o~(i,j), 
as was computed by UltraSimple. However, when RootedUltraSparsif y puts one of these 
edges into the set F, it gives it a different weight: u(i,j). For technical reasons, the set F is 
decomposed into subsets F b according to the quantities 4>(f ), which will play a role in the analysis 
of RootedUltraSparsif y analogous to the role played by 77(e) in the analysis of UltraSimple. 
Each set of edges F b is sparsified, and the union of the edges of E that appear in the resulting 
sparsifiers are returned by the algorithm. The edges in F b cannot necessarily be sparsified 
directly, as they might all have different endpoints. Instead, F b is first projected to a graph H b 
on vertex set {1, . . . , h}. After a sparsifier H b of H b is computed, it is lifted back to the original 
graph to form E b . Note that the graph E s returned by RootedUltraSparsif y is a subgraph of 
E, with the same edge weights. 

We now prove that F = u[ 1 °l 2v[E)] F b . Our proof will use the function ry, which we recall 
was defined in (|12p and which was used to define the map a. 

Lemma 10.1. For <f> as defined in (|24p . for every f = ip(i,j)a(i,j) £ F, 

l<ip(i,j)<<f>(f)<r)(E). (25) 

Proof. Recall from the definitions of </> and ip that 

£ eG fM e) =W,uO} wei S ht ( e ) 



4>(f)>^(iJ) 



weight(cr(i,j)) 



By definition cr(i,j) is an edge in E satisfying p(a(i,j)) = {Wi,Wj}; so, the right-hand side of 
the last expression is at least 1. 

To prove the upper bound on <p(f), first apply Proposition 19.41 to show that 

„ . = E egg:p(e ) ={ H/ i ,H/ J }Weight(e) ^ EeeE: P {e)={w i ,w j }^) < , E - 
weight (cr(i,j)) n(a(i,j)) 

as T) is always at least 1. Similarly, 



w 0,j) , 1 c st T (o-(i,j)) 
Ah 3)) 

3ight(cr(z, j 

where the second-to-last inequality follows from Proposition 19.41 □ 



S W) = ■ w c •v > st n <7 (*»j)) = — ■ rr, J- -w 1^ weight(e) 

weight( f x(7, J )) weight(a(,,j)) \ eeE , p(e ^ Wj} 

?M ,3 rv I £ weight(e) j < Y, V(e) < n{E), 

weight (an, 7 \ ] ^-^ 
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E s = RootedUltraSparsif y(E, T, r, t,p) 

Condition: for all e G E, r G T(e). The parameter t is a positive integer at most ["77(F)]. 

1. Compute str(e) and 77(e) for each edge e & E, where 77 is as defined in (fT2|) . 

2. If t > |F|, return E s = E. 

3. Set ({Wi,... ,W h },p) = decompose(T, F, 77, t). 

4. Compute a, as given by (fl3j) . everywhere it is defined. 

5. For every such that is defined, set 



weight(e) and = j)/ weight (a 



(23) 



eeE:p(e)={M/ I ,H/ j } 



6 



Set F = {tp(i, j)<r(i, j) : cr(i,j) is defined} . 
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For each / = tp(i, j)a(i, j) G F, set 



<£(/) = max(^(i,i),st T (/)). 



(24) 
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For b G {!,..., rj(E)-]}: 




H b = : il>(i,j)v(i,j) G F b } 




9 



Set E s = U b E°. 
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It will be convenient for us to extend the domain of p to F by setting p(f) = p(e) where 
e £ E has the same vertices as /. That is, when there exists 7 6 M + such that / = 7c Define 

= Ar,{E)/t. 

Our analysis of RootedUltraSparsif y will exploit the inequalities contained in the following 
two lemmas. 

Lemma 10.2. For every i for which \Wi\ > 1, 

E st ^f) ^ p- 

feF:Wie P (f) 

Proof. Consider any / G F, and let / = ip(i, j)a(i, j). Note that the weight of / is cj(i,j), and 
recall that stx(f) < ??(/)• We first show that 

E V(e) > t/(/). 

e:r(e)=o-(i,j) 



By Proposition 19.41 and the definition of r in (|19p 

> »7 e > — . r— ; :77 > weight (e) 
weight ex?, j) 

e:r(e)=cr(jj) e:r(e)=(r(i j) 

-weight (/) 



weight (cr(i, j )) 



We then have 



/ weight(/) st T (cr(z,j)) . 

= max ■ w /■ rrr; r. ... weight (/) 

V weight (ctO, j )) weight(cr(z,j)J 

= max (<£(/), st T (/)) (by([24D) 
>max(l,st T (/)) (by©) 

= »/(/)■ 



E ^ e )^ E rt/)- 

e6S:Wie/>(e) f€F:W t ep(f) 



The lemma now follows from the upper bound of 4rj(E)/t imposed on the left-hand term by 
Theorem [Ql □ 



Lemma 10.3. For every i for which \Wi\ > 1, 



E M) ^ ( 26 ) 

feF-.Wiep(f) 
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Proof. For an edge / G F, let ?/>(/) equal where / = tp(i,j)a(i,j). With this notation, 

we may compute 

E <K/)^ E st ^) + E w) 

/eF:^Gp(/) /eiW,£p(/) /eF:Wi6p(/) 

< E ^(/)+ E v>(/) 

/GF:H/,6p(/) f£F:Wiep(f) 

<p+ E 

/6f:Wiep(/) 

by Lemma 110.21 We now bound the right-hand term as in the proof of inequality (|22p : 

E *(/) = £ ^TTiT £ E TTW £ E *0 S A 

, , ^-^ weight (r(ej) t rj{ T ( e )) , 

by our choice of /? and Theorem 19.31 □ 

Lemma 10.4 (RootedUltraSparsif y). Let T be a spanning tree on a vertex set V , and let E 

be a non-empty set of edges on V for which there exists an r G V be such that for all e G E, 
r G T(v). For p > and t a positive integer at most \rj(E)~\, let E s be the graph returned by 
RootedUltraSparsify(E,T,r,t,p) . The graph E s is a subgraph of E, and with probability at 
least 1 - [log 2 n(E)~\ p, 

\E S \ < cilog C2 (n/p)max(l, \log 2 r,(E)])t, (27) 

and 

E^ (3(3 + 126/3 max(l, log 2 rj(E))) -T + 120/3 ■ E s , (28) 

where (3 = 4r](E)/t. 

Proof. We first dispense with the case in which the algorithm terminates at line 2. If t > m, 
then both (|27h and f)28|) are trivially satisfied by setting E s = E, as (3 > 2. 

By Theorem 11.31 each graph H b computed by Sparsify2 is a c\ log C2 (n/p)-sparsifier of H b 
according to Definition 1 1 . 21 with probability at least 1 —p. As there are at most [log 2 t](E)~\ such 
graphs H b , this happens for all of these graphs with probability at least 1 — |~log 2 r](E)~\ p. For 
the remainder of the proof, we will assume that each graph H b is a c\ log C2 (ra/p)-sparsifier of 
H b . Recalling that h < t, the bound on the number of edges in E s is immediate. 

Our proof of (|28|) will go through an analysis of intermediate graphs. As some of these could 
be multi-graphs, we will find it convenient to write them as sums of edges. 

To define these intermediate graphs, let rj be the vertex in Wj that is closest to r in T. As in 
Section EJ let Tj denote the edges of the subtree of T with vertex set Wj . We will view as the 
root of tree T,. Note that if | Wi| = 1, then Wi = {r{\ and Ti is empty. As distinct sets Wi and 
Wj can overlap in at most one vertex, T{ < T. We will exploit the fact that for each e G E 
with p(e) = {Wj, Wj}, the path T(e) contains both and rj, which follows from the condition 
r G T(e). 
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We now define the edge set D b , which is a projection of H b to the vertex set r±, . . . , r^, and 
D b , which is an analogous projection of the sparsifier H b . We set 

D b = "(hjKrurj) 

(i,j):i>(i,j)a(i,j)£F b 

and 

As the sets Wi and Wj are allowed to overlap slightly, it could be the case that some rj = rj for 
i 7^ j. In this case, D b would not be isomorphic to H b . 
Set 

F s = ■ 3 7 and so that -yui(i, j) G H b } . 

The edge set H b can be viewed as a projection of the edge set F b to the vertex set {1, . . . , h}, 
and the edge set F b can be viewed as a lift of H b back into a reweighted subgraph of F b . 
We will prove the following inequalities 

E 4 3(3 ■ T + 3F (29) 

F b 4 2j3 ■ T + 2D b (30) 

D b 4 (5/A)D b (31) 

D b ^ 16(3 • T + 2F b (32) 

F b 4 8(3 ■ E b (33) 

Inequality (128j) in the statement of the lemma follows from these inequalities and F = F b . 

To prove inequality (|29p , we exploit the proof of Theorem 19.51 The edges F constructed 
in RootedUltraSparsif y are the same as those chosen by UltraSimple, except that they are 
reweighted by the function tfj. If we follow the proof of inequality (|14|) in Theorem 19.51 but 
neglect to apply inequality (f22j) . we obtain 

+ 3 £ W ^\ .r(e) = 3p.T + 3F 
J^L weight(r(e)) 

To prove inequality (f30l) . consider any edge w(u,v) = f G F b . Assume p(f) = {Wi,Wj}, 
u G Wi and v G Wj. We will now show that 



/ ^ 2st T (/)(Ti + Tj) + 2w(r h rj). (34) 
As the path from u to v in T contains both ri and rj, 

resistance (T(u, rj)) + resistance(T(rj, v )) < resistance (T(u, v )) = stT{f)/w. 
Thus, the resistance of the path 

2str(/)r(u, n) + 2w(r u Tj ) + 2st T (f)T{rj,v) 
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is at most 1/w, and so Lemma 18 . 1 1 implies that 

/ ^ 2st T (f)T{u,r i ) + 2w(r i ,r j ) + 2st T (f)T(r j ,v), 
which in turn implies (|34p . Summing (|34p over all / G F b yields 

^ 2 E( E st T (/)]r i + 2i? b 



Fb ^ 2 E E st T (/))r i + 2L> 6 

i:|Wi|>l \/GF:W ! ep(/) 

^2^/3-T 4 + 2ZA 



as Tj is empty when |Wj| = 1 
by Lemma 110.21 



^ 2f3-T + 2D b . 

We now prove inequality (|32p . as it uses similar techniques. Let f s = w(u,v) G -Fj\ Then, 
there exist 7 and so that 70; (i, j) G it G W{, and u G Wj. Set 7(/ s ) to be this 
multiplier 7. By part (c) of Definition ! 1.21 we must have uj(i,j)(i,j) G if 6 and tp(i,j)a(i ) j) G 
Let / = ip(i,j)a(i,j). Note that / s = "f(f s )f- The sum of the resistances of the paths from 
to u in Ti and from t; to rj in Tj is 

resistance(T(rj, u)) + resistance (T(u, rj)) < resistance(T(u, v)) = str(/)/w(i, j), 

as weight (/) = u>(i,j). Thus, the resistance of the path 

2st T (/)T(r i , u) + 2/ + 2st T (f)T(v, rj ) 

is at most l/u(i,j), and so Lemma 18.11 implies that 

u&Mn,^) 4 2at T (f)(T i + T j ) + 2f, 

and 

rtf.MhMruTj) 4 2 7 (/ s )st T (/)(T i + T,-) + 2/ s 
^2 7 (/ s )0(/)(r i +r j ) + 2/ a 
^2 fe+1 7(/ s )(T i + T J ) + 2/ s 
Summing this inequality over all f s G F b , we obtain 

D» s 4j2[ 2b+1 E 7(/.))t, + 2** 

' \ f s eFl:Wie P (f s ) 

For all i such that \WA > 1, 



(by (2 
(by / G F b 



J]) 7(/s)<2|{/GF 6 :W,,Gp(/)} 

f s eFb:Wiep(f s ) 

< 2 J] <^(/)/2"- 1 

feFb-.Wiep(f) 

< A(3/2 b - 1 
= (3/2 b ~ 3 . 



(part (d) of Definition [L~2"]) 



(by Lemma flO. 3p 



(35) 
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So, 

D b s 4 16 @ • T i + 2F s + 2F b . 

i 

To prove inequality ([33]) . let f s be any edge in F s , let / be the edge in F such that f s = r y(f s )f, 
and let cr(i,j) be the edge such that f s = 7(/ s )Y'(^, j) a {hj)- It suffices to show that 

weighty) < 8/3 weight(ff(*,j)). (36) 

Set b so that f £ F b . By (|55j). 

7(/ s ) < /?/2 b ~ 3 < 8/3/0(/) = 8/3/max(^(i, j),st T (/)) < 8/3/^(i,i). 

As weight (/ s ) = j(f s )ijj(i, j)weight(<r(i, j)), inequality (f36l) follows. 

It remains to prove inequality (|3ip . The only reason this inequality is not immediate from 
part (a) of Definition 11.21 is that we may have r, = r, for some % ^ j. Let R = {n, . . . , r^} and 
5 = {1, . . . , /i}, Define the map tt : M R — ► IR 5 by vr(x)j = x n . We then have for all x G 

x T L D bX = ir(x) T L H bir(x) and x T L D bX = ir(x) T L H bir(x); 

so, 

x T L D bX = tt(x) t L H bTr(x) < (5/4)tt(x) t L H bir(x) = (5/4)x T L D bX. 

□ 

The algorithm UltraSparsif y will construct a low-stretch spanning tree T of a graph, choose 
a root vertex r, apply RootedUltraSparsif y to sparsify all edges whose path in T contains r, 
and then work recursively on the trees obtained by removing the root vertex from T. The root 
vertex will be chosen to be a tree splitter, where we recall that a vertex r is a splitter of a tree 
T if the trees T 1 , . . . ,T g obtained by removing r each have at most two-thirds as many vertices 
as T. It is well-known that a tree splitter can be found in linear time. By making the root a 
splitter of the tree, we bound the depth of the recursion. This is both critical for bounding the 
running time of the algorithm and for proving a bound on the quality of the approximation it 
returns. For each edge e such that r T(e), T(e) is entirely contained in one of T 1 , . . . ,T q . 
Such edges are sparsified recursively. 



U = UltraSparsif y(G = (V, E), k) 
Condition: G is connected. 






1. T = LowStretch(E). 






2. Set t = 517 • max(l, log 2 rj(E)) ■ 


lo g3/2 n 


r)(E)/k and p= (2 [log r](E)] n 2 ) \ 


3. If t > ij(E) then set A = E — T; otherwise, set A = TreeUltraSparsif y(E — T, t, T,p). 


4. U = TUA. 
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A 

A = 


rp TTT J_ O • j; / 7~r/ i/ /"T"l/ \ 

TreeUltraSparsif y(i/ , r , i ,p) 




1. 


If £" = 0, return A = 0. 




2. 


Compute a splitter r oi 1 . 




3. 


Set £ r = {edges e £ E' such that r G T'(e)} and r r = \tri(E r )/ri(E')] • 




4. 


If t r > 1, set A r = RootedUltraSparsif y(E r , T' , r, t r ,p); otherwise, set A r 


= 0. 


5. 


Set T , . . . , T q to be the trees obtained by removing r from T' . Set V 1 , . . . 
vertex sets of these trees, and set E , . . . , E q so that E % = { (u, v) € E' : {u 


, y« to be the 

u}cy<}. 


6. 


For i = 1, . . . , g, set 

A = A r UTreeUltraSparsify( J EV'?7(£; i )/?7(£; / ) 5 Ti 5 p)- 





Theorem 10.5 (Ultra-Sparsification). On input a weighted, connected n-vertex graph G = 
(V, E) and k > 1, UltraSparsify(E, k) returns a set of edges U = T U A C E such that T is a 
spanning tree of G, U C E, and with probability at least 1 — l/2n, 

U ^E^kU, (37) 

and 

\A\ < O (^log C2+5 n) , (38) 

where m = \E\. Furthermore, UltraSparsify runs in expected time mlog°^ n. 

We remark that this theorem is very loose when m/k > n. In this case, the calls made to 
decompose by RootedUltraSparsif y could have t > n, in which case decompose will just return 
singleton sets, and the output of RootedUltraSparsif y will essentially just be the output of 
Sparsify2 on E r . In this case, the upper bound in (|38l) can be very loose. 

Proof. We first dispense with the case t > n(E). In this case, UltraSparsify simply returns 
the graph E, so ([3?]) is trivially satisfied. The inequality t > rj(E) implies k < 0(log 2 n), so ([38]) 
is trivially satisfied as well. 

At the end of the proof, we will use the inequality t < rj(E). It will be useful to observe that 
every time TreeUltraSparsif y is invoked, 

t' = t V (E')/n(E). 

To apply the analysis of RootedUltraSparsif y, we must have 

f < \n(E r )] . 

This follows from 

f = \t' V (Er)/ V (E')] = \t V (E r )/v(E)] < \ V (E r )] , 
as TreeUltraSparsif y is only called if t < T](E). 
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Each vertex of V can be a root in a call to RootedUltraSparsif y at most once, so this sub- 
routine is called at most n times during the execution of UltraSparsif y. Thus by Lemma [10.41 
with probability at least 

1 — n |~log 2 rj(E)] p = 1 - l/2n, 

every graph E s returned by a call to RootedUltraSparsif y satisfies (|27p and (|28|) . Accordingly, 
we will assume both of these conditions hold for the rest of our analysis. 

We now prove the upper bound on the number of edges in A. During the execution of 
UltraSparsif y, many vertices become the root of some tree. For those vertices v that do not, 
set t„ = 0. By US 



\A\= J2 \ A "\ <cilog C2 (7i/p)max(l,riog 2?? (i?)l) £ t r . (39) 

r&V:t r >l r^V:t r >l 

As \z] < 2z for z > 1 and E Tl n E T2 = for each n / r 2 , 



r£_V:t r >l reV:tr>l 



n(E) 



2r](E r ) 



rev 



t>l 



Thus, 



P} < 2 Cl log C2 (n/p) [logs 7?(^)] t 

< 2 Cl log C2 (n/p) [log, ri(E)] 517 • log 2 t?(£) • [log 3/2 n] r/(£;)/fc 

< o(^log C2+5 n) , 

where the last inequality uses ^(-E 1 ) = 0(m log n log 2 n) = 0(m log 2 n) from Theorem 19.11 and 
log m = 0(log n). 

We now establish (|37p . For every vertex r that is ever selected as a tree splitter in line 2 of 
TreeUltraSparsif y, let T r be the tree T' of which r is a splitter, and let E r denote the set of 
edges and t r be the parameter set in line 3. Observe that U r E r = E — T. Let 

Pr = HE r )/tr, 

and note this is the parameter used in the analysis of RootedUltraSparsif y in Lemma [10.41 If 
t r > 1, let A r be the set of edges returned by the call to RootedUltraSparsif y. By Lemma [10.41 
RootedUltraSparsif y returns a set of edges A r satisfying 

E r 4 (3/? r + 126/3 r max(l, log 2 r](E r ))) • T r + 120/? r ■ A r . (40) 

On the other hand, if t r = 1 and so A r = 0, then (3 r = Arj{E r ). We know that (f4T)|) is satisfied in 
this case because E r 4 r](E r )T r (by CED). If t r = 0, then E r = and (gOD is trivially satisfied. 
As U = \t V (E r )/v(E)~] , 

Pr < ME)/t. 

We conclude 

E r 4 129p r max(l,log 2 ?](E r ))-T r +120p r -A r 4 516{r](E)/t) max(l, log 2 r](E r ))T r +l20(r](E)/t)A r . 
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Adding T, summing over all r, and remembering i](E r ) < r](E), we obtain 

T + (E - T) 4 T + 516{7](E)/t) max(l, log 2 r](E)) ^ T r + l20(rj(E)/t)A. 



As r is always chosen to be a splitter of the tree input to TreeUltraSparsif y, the depth of the 



recursion is at most 



log 



3/2 



n 



sum T r , and we may cone 



. Thus, no edge of T appears more than 
ude 



log 



3/2 



n 



times in the 



T + (E-T) 4T + 516{rj(E)/t) max(l, log 2 r,(E)) 



log 3 / 2 n 



T + 120(rj(E)/t)A 



4 5n(7](E)/t) max(l,log 2 n(E)) 

4 k(T + A) 
= kU, 



log 3/2 n 



T + 120(rj(E)/t)A 



where the second inequality follows from t < t](E), and the third inequality follows from the 
value chosen for t in line 2 of UltraSparsif y. 

To bound the expected running time of UltraSparsif y, first observe that the call to 
LowStretch takes time 0(m log 2 n). Then, note that the routine TreeUltraSparsif y is re- 
cursive, the recursion has depth at most O(logn), and all the graphs being processed by 
TreeUltraSparsif y at any level of the recursion are disjoint. The running time of TreeUltraSpars 
is dominated by the calls made to Sparsif y2 inside RootedUltraSparsif y. Each of these takes 
nearly-linear expected time, so the overall expected running time of TreeUltraSparsif y is 
0(mlog 0(1) n). □ 



References 

[ABN08] I. Abraham, Y. Bartal, and O. Neiman. Nearly tight low stretch spanning trees. 

In Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer 
Science, pages 781-790, Oct. 2008. 

[AKPW95] Noga Alon, Richard M. Karp, David Peleg, and Douglas West. A graph-theoretic 
game and its application to the fc-server problem. SIAM Journal on Computing, 
24(1):78-100, February 1995. 

[Axe85] O. Axelsson. A survey of preconditioned iterative methods for linear systems of 
algebraic equations. BIT Numerical Mathematics, 25(1):165-187, March 1985. 

[BBC + 94] R. Barrett, M. Berry, T. F. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, 
R. Pozo, C. Romine, and H. Van der Vorst. Templates for the Solution of Linear 
Systems: Building Blocks for Iterative Methods, 2nd Edition. SIAM, Philadelphia, 
PA, 1994. 

[BCHT04] Erik G. Boman, Doron Chen, Bruce Hendrickson, and Sivan Toledo. Maximum- 
weight-basis preconditioners. Numerical linear algebra with applications, 11(8- 
9):695-721, October/November 2004. 



40 



[BGH + 06] M. Bern, J. Gilbert, B. Hendrickson, N. Nguyen, and S. Toledo. Support-graph 
preconditioners. SIAM J. Matrix Anal. & Appl, 27(4):930-951, 2006. 

[BH01] Erik Bornan and B. Hendrickson. On spanning tree preconditioners. Manuscript, 
Sandia National Lab., 2001. 

[BH03a] Mario Bebendorf and Wolfgang Hackbusch. Existence of H-matrix approximants 
to the inverse FE-matrix of elliptic operators with L°°-coemcients. Numerische 
Mathematik, 95(1): 1-28, July 2003. 

[BH03b] Erik G. Boman and Bruce Hendrickson. Support theory for preconditioning. SIAM 
Journal on Matrix Analysis and Applications, 25(3):694-717, 2003. 

[BHM01] W. L. Briggs, V. E. Henson, and S. F. McCormick. A Multigrid Tutorial, 2nd 
Edition. SIAM, 2001. 

[BHV04] Erik G. Boman, Bruce Hendrickson, and Stephen A. Vavasis. Solving elliptic fi- 
nite element systems in near-linear time with support preconditioners. CoRR, 
cs.NA/0407022, 2004. 

[BMN04] M. Belkin, I. Matveeva, and P. Niyogi. Regularization and semi-supervised learning 
on large graphs. Proc. 17th Conf. on Learning Theory, pages 624-638, 2004. 

[BSS09] Joshua D. Batson, Daniel A. Spielman, and Nikhil Srivastava. Twice- Ramanuj an 
sparsifiers. In Proceedings of the 41st Annual ACM Symposium on Theory of com- 
puting, pages 255-262, 2009. 

[CW82] D. Coppersmith and S. Winograd. On the asymptotic complexity of matrix multi- 
plication. SIAM Journal on Computing, ll(3):472-492, August 1982. 

[DER86] I. S. Duff, A. M. Erisman, and J. K. Reid. Direct Methods for Sparse Matrices. 
Oxford Science Publications, 1986. 

[DS07] Samuel I. Daitch and Daniel A. Spielman. Support-graph preconditioners for 2- 
dimensional trusses. CoRR, abs/cs/0703119, 2007. 

[DS08] Samuel I. Daitch and Daniel A. Spielman. Faster approximate lossy generalized flow 
via interior point algorithms. In Proceedings of the 40th Annual ACM Symposium 
on Theory of Computing, pages 451-460, 2008. 

[EEST08] Michael Elkin, Yuval Emek, Daniel A. Spielman, and Shang-Hua Teng. Lower- 
stretch spanning trees. SIAM Journal on Computing, 32(2):608-628, 2008. 

[EGB05] Ojas Parekh Sivan Toledo Erik G. Boman, Doron Chen. On factor width and 
symmetric /i-matrices. Linear Algebra and its Applications, 405(1) :239-248, August 
2005. 

[FG04] A. Frangioni and C. Gentile. New preconditioners for KKT systems of network flow 
problems. SIAM Journal on Optimization, 14(3):894-913, 2004. 



41 



[Fie73] Miroslav Fiedler. Algebraic connectivity of graphs. Czechoslovak Mathematical 
Journal, 23(98) :298-305, 1973. 

[Geo73] J. A. George. Nested dissection of a regular finite element mesh. SIAM J. Numer. 
Anal, 10:345-363, 1973. 

[GMZ95] Keith D. Gremban, Gary L. Miller, and Marco Zagha. Performance evaluation of a 
new parallel preconditioner. In Proceedings of the 9th International Symposium on 
Parallel Processing, pages 65-69. IEEE Computer Society, 1995. 

[G088] G. H. Golub and M. Overton. The convergence of inexact Chebychev and Richardson 
iterative methods for solving linear systems. Numerische Mathematik, 53:571-594, 
1988. 

[Gre96] Keith Gremban. Combinatorial Preconditioners for Sparse, Symmetric, Diagonally 
Dominant Linear Systems. PhD thesis, Carnegie Mellon University, CMU-CS-96- 
123, 1996. 

[GT87] John R. Gilbert and Robert Endre Tarjan. The analysis of a nested dissection 
algorithm. Numerische Mathematik, 50(4):377-404, February 1987. 

[Har72] Frank Harary. Graph Theory. Addison- Wesley, 1972. 

[Jos97] Anil Joshi. Topics in Optimization and Sparse Linear Systems. PhD thesis, UIUC, 
1997. 

[KM07] Ioannis Koutis and Gary L. Miller. A linear work, o(n 1 / 6 ) time, parallel algorithm 
for solving planar Laplacians. In Proceedings of the 18th Annual ACM-SIAM Sym- 
posium on Discrete Algorithms, pages 1002-1011, 2007. 

[LRT79] Richard J. Lipton, Donald J. Rose, and Robert Endre Tarjan. Generalized nested 
dissection. SIAM Journal on Numerical Analysis, 16(2):346-358, April 1979. 

[Mih89] Milena Mihail. Conductance and convergence of Markov chains — A combinatorial 
treatment of expanders. In 30th Annual IEEE Symposium on Foundations of Com- 
puter Science, pages 526-531, 1989. 

[MMP + 05] Bruce M. Maggs, Gary L. Miller, Ojas Parekh, R. Ravi, and Shan Leung Maverick 
Woo. Finding effective support-tree preconditioners. In Proceedings of the seven- 
teenth annual A CM symposium on Parallelism in algorithms and architectures, pages 
176-185, 2005. 

[Rei98] John Reif. Efficient approximate solution of sparse linear systems. Computers and 
Mathematics with Applications, 36(9):37-58, 1998. 

[SS08] Daniel A. Spielman and Nikhil Srivastava. Graph sparsification by effective resis- 
tances. In Proceedings of the 40th annual ACM Symposium on Theory of Computing, 
pages 563-568, 2008. 



42 



[SST06] A. Sankar, D. A. Spielman, and S.-H. Teng. Smoothed analysis of the condition 
numbers and growth factors of matrices. SIAM Journal on Matrix Analysis and 
Applications, 28(2):446-476, 2006. 

[ST04] Daniel A. Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph 
partitioning, graph sparsification, and solving linear systems. In Proceedings of the 
thirty-sixth annual ACM Symposium on Theory of Computing, pages 81-90, 2004. 
Full version available at http://arxiv.org/abs/cs.DS/0310051. 

[ST08a] Gil Shklarski and Sivan Toledo. Rigidity in finite-element matrices: Sufficient con- 
ditions for the rigidity of structures and substructures. SIAM Journal on Matrix 
Analysis and Applications, 30(1):7~40, 2008. 

[ST08b] Daniel A. Spielman and Shang-Hua Teng. A local clustering algorithm for mas- 
sive graphs and its application to nearly-linear time graph partitioning. CoRR, 
abs/0809.3232, 2008. Available at http://arxiv.org/abs/0809.3232. 

[ST08c] Daniel A. Spielman and Shang-Hua Teng. Spectral sparsification of graphs. CoRR, 
abs/0808.4134, 2008. Available at http://arxiv.org/abs/0808.4134. 

[Str86] Gilbert Strang. Introduction to Applied Mathematics. Wellesley- Cambridge Press, 
1986. 

[SW09] Daniel A. Spielman and Jaeoh Woo. A note on preconditioning by 
low-stretch spanning trees. CoRR, abs/0903.2816, 2009. Available at 
http : //arxiv . org/abs/0903 . 2816. 

[TB97] L. N. Trefethen and D. Bau. Numerical Linear Algebra. SIAM, Philadelphia, PA, 
1997. 

[Vai90] Pravin M. Vaidya. Solving linear equations with symmetric diagonally dominant 
matrices by constructing good preconditioners. Unpublished manuscript UIUC 1990. 
A talk based on the manuscript was presented at the IMA Workshop on Graph 
Theory and Sparse Matrix Computation, October 1991, Minneapolis., 1990. 

[ZBL+03] Dengyong Zhou, Olivier Bousquet, Thomas Navin Lai, Jason Weston, and Bernhard 
Scholkopf. Learning with local and global consistency. In Adv. in Neural Inf. Proc. 
Sys. 16, pages 321-328, 2003. 

[ZGL03] Xiaojin Zhu, Zoubin Ghahramani, and John D. Lafferty. Semi-supervised learning 
using Gaussian fields and harmonic functions. In Proc. 20th Int. Conf. on Mach. 
Learn., 2003. 



A Gremban's reduction 

Gremban |Gre96| (see also [MMP + 05] ) provides the following method for handling positive off- 
diagonal entries. If A is a SDDo-matrix, then Gremban decomposes A into D + A n + A p , where 
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D is the diagonal of A, A n is the matrix containing all the negative off-diagonal entries of A, 
and A p contains all the positive off-diagonals. Gremban then considers the linear system 



A 



X2 



where 



D + A n 

-An 



—Ap 
D + A n 



and b 



and observes that x = (x\ — X2j/2 will be the solution to Ax = b, if a solution exists. Moreover, 
approximate solutions of Gremban's system yield approximate solutions of the original: 



X 2 



A ] b 


< e 


A ] b 









implies 



x 



C e 



A ] b 



where again x = (x\ — X2)/2. Thus we may reduce the problem of solving a linear system in a 
SDDo-matrix into that of solving a linear system in a SDDMo-matrix that is at most twice as 
large and has at most twice as many non-zero entries. 



B Computing the stretch 

We now show that given a weighted graph G = (V,E) and a spanning tree T of G, we can 
compute str(e) for every edge e G E in 0{{m + n) logn) time, where m = \E\ and n = \ V\. 

For each pair of vertices u,v G V, let resistance(u, v ) be the resistance of T(u,v), the path 
in T connecting u and v. We first observe that for an arbitrary r G V, we can compute 
resistance(u, r) for all v G V in 0(n) time by a top-down traversal on the rooted tree obtained 
from T with root r. Using this information, we can compute the stretch of all edg es in E T — 
{edges e G E such that r G T(e)} in time 0(\E r \). We can then use tree splitters in the same 
manner as in TreeUltraSparsif y to compute the stretch of all edges in E in 0((m + n) logn) 
time. 



C Decomposing Trees 

The pseudo-code for for decompose appears on the next page. The algorithm performs a depth- 
first traversal of the tree, greedily forming sets Wi once they are attached to a sufficient number 
of edges of E. While these sets are being created, the edges they are responsible for are stored 
in F su b, and the sum of the value of r\ on these edges is stored in w su b- When a set Wi is formed, 
the edges e for which p(e) = Wi are set to some combination of F^ and F v . 

We assume that some vertex r has been chosen to be the root of the tree. This choice is 
used to determine which nodes in the tree are children of each other. 

Proof of Theorem \9.3l As algorithm decompose traverses the tree T once and visits each edge 
in E once, it runs in linear time. 

In our proof, we will say that an edge e is assigned to a set Wj if Wj G p(e). To prove part 
(a) of the theorem, we use the following observations: If Wj is formed in step 3.c.ii or step 6.b, 
then the sum of r] over edges assigned to Wj is at least <f>, and if Wj is formed in step 7.b, then 
the sum of r\ of edges incident to Wj and WV+i (which is a singleton) is at least 2<p. Finally, 
if a set Wh is formed in line 5.b of decompose, then the sum of r\ over edges assigned to Wh is 
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{{W x , ...,W h },p) = decompose^, E, V , t) 

Comment: h, p, and the W^s are treated as global variables. 

1. Set h = 0. 

2. For all e£ E, set p(e) = 0. 

3. Set = 2£ e6J5 f7(e)/t. 

4. (F,w,U) = sub(r). 

5. If [/ / 0, 

(a) h = h + 1. 

(b) Wft = 17. 

(c) For all e £ F, set p(e) = p(e) U {W^}. 



(F,w,U) = sub(v) 

1. Let ui, . . . ,u s be the children of v. 

2. Set w sub = 0, F su6 = and U sub = 0. 

3. For i = 1, . . . , s 

(a) (Fi,Wi,Ui) = sub(«t). 

(b) w s „j = u; su & + F sufe = F sub U Fj, = U sub U C7. 

(c) If w sub > <p, 

i. h = h + l. 

ii. Set W h = C/ su6 U{u}. 

hi. For all e G F sub , set p(e) = p(e) U {W/J. 
iv. Set w sub = 0, F sub = and U sub = 0. 

4. Set F v = {(u,v) G the edges attached to u. 

5. Set «;„ = Y,eeF v v(e)- 

6. If < w v + w SM fe < 20, 

(a) h = h + l. 

(b) SetW ft = Z7„ l6 U{t;}. 

(c) For all e G F sm6 U F„, set p(e) = p(e) U {W^}. 

(d) Return (0,0,0). 

7. If uj„ + w sub > 2(f), 

(a) h = h + l. 

(b) Set W fc = £7^. 

(c) For all e G F sub , set p(e) = p(e) U {W h }. 

(d) /i = /i + l. 

(e) Set W h = {v}. 

(f) For all e G F„, set p(e) = p(e) U {W h }. 

(g) Return (0,0,0). 

8. Return (F sufe U F tI , w sufe + w v , U sub U {i;}) 
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greater than zero. But, at most one set is formed this way. As each edge is assigned to at most 
two sets in W\ , ■ ■ ■ , Wh , we may conclude 



2j2v{e)>(h-l)tj>, 

which implies t > h — 1. As both t and h are integers, this implies t > h. 

We now prove part (6). First, observe that steps 6 and 7 guarantee that when a call to 
sub(-u) returns a triple (F,w, U), 

w = ^2 r )(e) < <t>- 

e£U 

Thus, when a set Wh is formed in step 3.c.ii, we know that the sum of rj over edges assigned to 
Wh equals w su b and is at most 2(j>. Similarly, we may reason that w S ub < <j> at step 4. If a set 
Wh is formed in step 6.b, the sum of r\ over edges associated with Wh is w v + w su b, and must 
be at most 2(f). If a set Wh is formed in step 7.b, the sum of rj over edges associated with Wh is 
w su b, which we established is at most (p. As the set formed in step 7.e is a singleton, we do not 
need to bound the sum of rj over its associated edges. □ 

Lemma C.l. Suppose G = (V,E) is a planar graph, ir is a planar embedding of G, T is a 
spanning tree of G, and t > 1 is an integer. Let ({Wi, . . . , Wh} , p) = decompose(T, E, rj, t) with 
the assumption that in Step 1 of sub, the children v±, . . . ,v s ofv always appear in clock-wise order 
according to it. Then the graph GWi,...,W) l } = ({!> ■ ■■ ,h} : 3 e S E,p{e) = {Wi, Wj}}) 

is planar. 

Proof. Recall that the contraction of an edge e = (u, v) in a planar graph G = (V, E) defines 
a new graph (V — {u} , E U {(x, v) : (x, u) G E} — {(x, u) G E}). Also recall that edge deletions 
and edge contractions preserve planarity. 

We first prove the lemma in the special case in which the sets W\, . . . , Wh are disjoint. For 
each j, let Tj be the graph induced on T by Wj. As each Tj is connected, G{ Wl ^ Wh y is a 
subgraph of the graph obtained by contracting all the edges in each subgraph Tj. Thus in this 
special case G{ Wlt Wh y is planar. 

We now analyze the general case, recalling that the sets W\, . . . , Wh can overlap. However, 
the only way sets Wj and W^ with j < k can overlap is if the set Wj was formed at step 3.c.ii, 
and the vertex v becomes part of Wk after it is returned by a call to sub. In this situation, no 
edge is assigned to Wj for having v as an end-point. That is, the only edges of form (x,v) that 
can be assigned to Wj must have x G Wj. So, these edges will not appear in Gr Wl >Wh y. 

Accordingly, for each j we define 




Wj — v if Wj was formed at step 3.c.ii, and 
Wj otherwise. 



We have shown that Gr Wl jWh \ = Gi Xl , ...,X h \- Moreover, the sets X\, . . . ,Xh are disjoint. Our 
proof would now be finished, if only each subgraph of G induced by a set Xj were connected. 
While this is not necessarily the case, we can make it the case by adding edges to E. 

The only way the subgraph of G induced on a set Xj can fail to be connected is if Wj is 
formed at line 3.c.ii from the union of v with a collection sets Ui for < i < i\ returned by 
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recursive calls to sub. Now, consider what happens if we add edges of the form (vi, Ui+i) to the 
graph for iq < i < i±, whenever they are not already present. As the vertices % appear 

in clock-wise order around v, the addition of these edges preserves the planarity of the graph. 
Moreover, their addition makes the induced subgraphs on each set Xj connected, so we may 
conclude that Gix t x h } 1S i n f ac t planar. □ 

D The Pseudo-Inverse of a Factored Symmetric Matrix 

We recall that £?' is the pseudo-inverse of B if and only if it satisfies 

BB ] B = B (41) 

B ] BB ] = B ] (42) 

(BB*) T = BB^ (43) 

(B ] B) T = B ] B. (44) 

We now prove that if B = XCX T , where X is a non-singular matrix and C is symmetric, 
then 

B ] = 1IX~ T C^X~ X U, 

where II is the projection onto the span of B. We will prove that by showing that IiX~ T X^ 1 !! 
satisfies axioms (j4lti44|) . Recall that II = BB^ = B^B and that UB = B. 
To verify (|4ip . we compute 

B(JlX- T C ] X- l Il)B = BX~ T C ] X~ l B 

= {XCX T )X~ T C ] X~ 1 {XCX T ) 
= XCC ] CX T 
= XCX T 
= B. 

To verify (j4*2j) . we compute 

(nx- T c t x^ 1 n)5(nx^ T c t x^ 1 n) = ux~ t c^x- 1 bx^ t c^x- 1 u 

= ux- T c^x- 1 xcx T x- T c^x- 1 u 
= nx- T tfctfx- l n 
= iLX~ T c t x~ 1 n. 
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To verify (|43p . it suffices to verify that BUX T C^X 1 T1 is symmetric, which we now do: 
B(ILX'~ T C rt A'~ 1 II) = UBX^C^X^U 

= u(xcx T )(x- T c^x- 1 u) 
= uxcc^x^u 

= BtBXCCf<X- 1 BBt 

= B^XCX T XCC ] X~ 1 XCX T B^ 

= B^XCX T XCC ] CX T B^ 

= B ] XCX T XCX T B ] 

= B ] BBB\ 

which is symmetric as B is symmetric. 

As B and T1X~ T C' X -1 !! are symmetric, it follows that (|44p is satisfied as well. 
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